Staff Machine Learning Engineer - VLM/LLM Evaluation
$238k - $302k
Waymo
Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building the Waymo Driver, The World's Most Experienced Driver™, to improve access to mobility while saving thousands of lives now lost to traffic crashes. The Waymo Driver powers Waymo’s fully autonomous ride-hail service and can also be applied to a range of vehicle platforms and product use cases. The Waymo Driver has provided over ten million rider-only trips, enabled by its experience autonomously driving over 100 million miles on public roads and tens of billions in simulation across 15+ U.S. states.
The mission of the Waymo AI Foundations team is to develop machine learning solutions addressing open problems in autonomous driving, towards the goal of safely operating Waymo vehicles in dozens of cities and under all driving conditions. As part of our work, we also initiate and foster collaborations with other research teams in Alphabet. AI Foundations areas that we are currently focusing on include reinforcement learning, learning from demonstration, generative modeling, Bayesian inference, hierarchical learning, and robust evaluation.
This role follows a hybrid work schedule and you will report to a Senior Staff Software Engineer.
You will:
- Work with a creative team of people who help to build the state-of-the-art Foundation Models that are used throughout Waymo’s systems, both onboard autonomous vehicles and offboard in simulation
- Lead the development of end-to-end evaluation systems and benchmarks for Waymo Foundation Models, encompassing the entire lifecycle from pretraining and supervised fine-tuning (SFT) to reinforcement learning (RL), for evaluating the quality, safety, and realism of embodied AI agents
- Partner within and across organizations to land disruptive and innovative tech in production
- Implement and extend large-scale data and evaluation pipelines
You have:
- Master’s degree or PhD in Computer Science, similar technical field of study, or equivalent practical experience
- 5+ years of experience in ML engineering and applied Deep Learning, with a strong portfolio of shipped products or publication record
- Experience with large-scale distributed systems
- Proficient programming skills (e.g., Python, C/C++)
- Strong analytical and debugging skills
- ML infra experience: training, evaluating and deploying ML models at scale
- Deep learning experience, especially with generative models, e.g., LLMs/VLMs,
Benefits:
- Health, dental, vision, life, disability insurance
- Retirement Benefits: 401(k) with company match
- Paid Time Off: 20 days of vacation per year, accruing at a rate of 6.15 hours
- Maternity Leave (Short-Term Disability + Baby Bonding): 28-30 weeks
- Baby Bonding Leave: 18 weeks
- Holidays: 13 paid days per year
$238,000—$302,000 USD

