
Staff Machine Learning Engineer - VLM/LLM Evaluation

$238k - $302k

Waymo

Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building the Waymo Driver (The World's Most Experienced Driver) to improve access to mobility while saving thousands of lives now lost to traffic crashes. The Waymo Driver powers Waymo's fully autonomous ride-hail service and can also be applied to a range of vehicle platforms and product use cases. The Waymo Driver has provided over ten million rider-only trips, enabled by its experience autonomously driving over 100 million miles on public roads and tens of billions in simulation across 15+ U.S. states.

The mission of the Waymo AI Foundations team is to develop machine learning solutions addressing open problems in autonomous driving, towards the goal of safely operating Waymo vehicles in dozens of cities and under all driving conditions. As part of our work, we also initiate and foster collaborations with other research teams in Alphabet. AI Foundations areas that we are currently focusing on include reinforcement learning, learning from demonstration, generative modeling, Bayesian inference, hierarchical learning, and robust evaluation.

This role follows a hybrid work schedule and you will report to a Senior Staff Software Engineer.

You will:

  • Work with a creative team of people who help to build the state-of-the-art Foundation Models that are used throughout Waymo's systems, both onboard autonomous vehicles and offboard in simulation
  • Lead the development of end-to-end evaluation systems and benchmarks for Waymo Foundation models, encompassing the entire lifecycle from pretraining and supervised fine-tuning (SFT) to reinforcement learning (RL), for evaluating the quality, safety, and realism of embodied AI agents
  • Partner within and across organizations to land disruptive and innovative tech in production
  • Implement and extend large-scale data and evaluation pipelines

You have:

  • Master's or PhD degree in Computer Science or a similar technical field of study, or equivalent practical experience
  • 5+ years of experience in ML engineering and applied Deep Learning, with a strong portfolio of shipped products or publication record
  • Experience with large-scale distributed systems
  • Proficient programming skills (e.g., Python, C/C++)
  • Strong analytical and debugging skills

We prefer:

  • ML infra experience: training, evaluating and deploying ML models at scale
  • Deep learning experience, especially with generative models, e.g., LLMs/VLMs, and/or reinforcement learning
  • Proficiency and in-depth knowledge of the inner workings of an ML framework (e.g., PyTorch, JAX, TensorFlow)

In accordance with Washington state law, we are highlighting our comprehensive benefits package, which is available to all eligible US-based employees. Benefits for this role include:

  • Health, dental, vision, life, disability insurance
  • Retirement Benefits: 401(k) with company match
  • Paid Time Off: 20 days of vacation per year, accruing at a rate of 6.15 hours per pay period for the first five years of employment
  • Sick Time: 40 hours/year (statutory, where applicable); 5 days/event (discretionary)
  • Maternity Leave (Short-Term Disability + Baby Bonding): 28-30 weeks
  • Baby Bonding Leave: 18 weeks
  • Holidays: 13 paid days per year

The expected base salary range for this full-time position across US locations is listed below. Actual starting pay will be based on job-related factors, including exact work location, experience, relevant training and education, and skill level. During the hiring process, your recruiter can share more about the specific salary range for the role location or, if the role can be performed remotely, for your preferred location.

Waymo employees are also eligible to participate in Waymo's discretionary annual bonus program, equity incentive plan, and generous Company benefits program, subject to eligibility requirements.

Salary Range

$238,000 - $302,000 USD

Vacancy posted 10 days ago