ML Evals Engineer Build Benchmarking Pipelines
AI Chopping Block, Inc.
AI Chopping Block, Inc. in San Francisco is looking for a Research Engineer to build evaluation systems for their product Firecrawl. The ideal candidate will have over 3 years of experience in Machine Learning engineering, and the role includes responsibilities such as designing metrics, building pipelines, and defining benchmarks to ensure output quality. The position offers a hybrid work option, competitive salary, and equity opportunities.
- ...About the Role ML Ops Engineer - Agentic AI Lab (Founding Team... ...) Backed by 8VC, we're building a world-class team to tackle... ...LLMs, agent-native pipelines, retrieval-augmented... ...manage evaluation and benchmarking frameworks (e.g. OpenLLM-Evals, RAGAS, LangSmith) Integrate...Pipeline, Full time
- ...About the Role As a Research Engineer at Mercor, you’ll work at... ...applied AI research. You’ll own benchmarking pipelines, evaluation systems, and... .... You’ll design and run evals, build rubrics and scorers, and turn... ...hands‑on experience with ML models and evaluation code....Pipeline, Work at office
$151k - $257k
...Zoox is seeking a Machine Learning Engineer for 3D Simulation in San Francisco. This role focuses on developing advanced... ...Responsibilities include collaborating with researchers and building scalable cloud pipelines. Compensation ranges from $151,000 to $257,000 annually,...Pipeline
- ...Jack & Jill is seeking a Founding AI/ML Engineer in San Francisco. The role involves building Generative Engine Optimization systems for AI search. Ideal candidates have strong NLP fundamentals and experience in production ML systems. You'll collaborate with world-class...Pipeline
- ...AI We're a small, senior team building the intelligent OS for the... ...The Role We're hiring a Senior ML Engineer to help build the AI systems that... ...across the full ML stack: data pipelines, model training, retrieval, ranking, evals, and deployment. You will ship features...Pipeline, Work at office, Relocation, Relocation package, Flexible hours
- ...ML Systems Engineer – Robotics & AI We are building the full-stack foundation for the next generation of humanoid robots... ...sharded training, tensor/pipeline parallelism, gradient accumulation... ...for real-world robotics, not toy benchmarks. Tight collaboration between systems...Pipeline
$250k - $350k
...Most AI roles build on top of models. This one builds what... ...actually work. We’re hiring ML Infrastructure Engineers to tackle a hard, real-world... ..., and AI. This isn’t clean benchmark data. It’s messy, continuous... ...: High-throughput video pipelines handling millions of hours...Pipeline
- ...A tech startup in San Francisco is seeking a Machine Learning Engineer to shape technical direction, automate ML life cycles, and contribute to the architectural roadmap. The ideal candidate has 3 to 5 years in full-stack Deep Learning and Computer Vision, prior startup...Pipeline
$150k - $200k
...speed of thought. Mithrl is building the world’s first... ...Scientist. It is a discovery engine that transforms messy biological... ...THE ROLE We are hiring an ML Engineer, Analysis and Simulation... ...data Validate results, benchmark pipelines, and ensure scientific...Pipeline, Work at office
- ...monitoring platform for AI agents. Engineering teams at some of the fastest... ...logs and trying to debug flaky evals that just aren't matching... ...and more. Your Focus Build out a world-class product -... ...Architect, implement, and scale ML pipelines Quick iteration without...Pipeline, Temporary work
- ...Key Engineer For Ai-native Core Tools And Models We're looking... ...engineer with significant AI/ML + LLM experience to build AI-native core tools and models... ...agents Experience building ML pipelines Experience fine-tuning and benchmarking LLMs/foundation models...Pipeline
- ...Mach9 ML Engineer Role At Mach9, ML Engineers build the perception models at the core of our AI-enabled CAD system... ...power our automated extraction pipeline — image and 3D detection and localization... ..., not just publishing or benchmarking. Working knowledge of geometric...Pipeline
$200k
...Founding ML Engineer San Francisco, on-site, full-time - $200,000 - $500,000 per year... ...welcome! More concretely, you will… You'll build our ML pipeline for peptide drug discovery from the... ...) on our proprietary data and benchmark against classical approaches Design rigorous...Pipeline, Full time, Night shift, Day shift, Afternoon shift
- ...out of the box. At Chef, we're building that model: the Food Foundation Model. As a Senior ML Engineer, Foundation Models, you will... .... Your models won't just benchmark well; they'll serve millions... ..., fine-tuning, and alignment pipelines that improve the model's ability...Pipeline, Flexible hours
$500 per month
...backed defense tech startup building autonomous, edge deployed... ...We're a small team of engineers, former US military operators... ...had to work outside a benchmark. You'll partner directly... ...classification, and sensor fusion ML model development, training pipelines, and on-device deployment...Pipeline, Permanent employment, Work at office, Monday to Friday, Flexible hours, Night shift, Weekend work
- ...the Role Chef Robotics is building autonomous robots that work... ...configurations. As a Senior ML Engineer, Manipulation, you will own... ...to build data collection pipelines using teleoperation, kinesthetic... ...metrics and regression benchmarks that accurately predict real...Pipeline, Flexible hours
- ...Position: Senior ML Performance Engineer Location: SF Bay Area (US) or Toronto... ...infrastructure company is building a high-performance,... ...across GPU clusters Define benchmarking methodologies, metrics, and... ...improvements Build automated pipelines for continuous performance...Pipeline, Full time
$180k - $240k
We're looking for an ML engineer to push the boundaries of what our... ...agents can do. You'll design and build the AI systems that... ...signals Build and refine LLM pipelines: prompt engineering, retrieval... ...industrial problems, not just benchmarks Optimized,...Pipeline
- ...Founding Applied ML Engineer Title of Role: Founding Applied ML Engineer... ...What You Will Do Design, build, and iterate on machine learning systems and pipelines for audio dataset creation and... ...delivery workflows. Evaluate and benchmark model performance, establish...Pipeline, Work at office
- ...out 1962 new Machine Learning Engineer opportunities posted on AI Chopping Block Design, build, and maintain scalable machine... ...Develop and optimize end-to-end ML pipelines encompassing data collection,... ..., and storage. Write tests, benchmarks, and diagnostics to detect significant...Pipeline, Flexible hours
- ...an LLM-first search engine and our specialized data... ...one, and popular benchmarks do not effectively cover... ...this role, you will build specialized evals to improve answer quality... ...automated evaluation pipelines to assess answer... ...methods to real-world ML problems Experience defining...Pipeline
- ...Onsite Machine Learning Engineer Location: Onsite in... ...Are UniversalAGI is building OpenAI for Physics. AI... ...UniversalAGI is hiring an ML Engineer to help ship... ...training/fine-tuning, benchmarking, and delivering... ...preprocessing and data generation pipelines to support model...Pipeline, Work at office, Flexible hours, 1 day per week
$100k - $200k
...scales voice and chat AI agents ML‑Infrastructure Engineer Salary $100K - $200K Equity... ...to the rest of our pipeline. Making our pipelines go fast... ...experiment with all of them. You'll benchmark the latest models across... ...make pragmatic calls on build‑vs‑buy, self‑host‑vs‑...Pipeline, Full time, Live in, Work at office
$141k - $249k
...with autonomy and algorithm engineers to scale safe self-driving systems... ...Expand the model deployment pipeline to new GPUs and embedded... ...on the truck. - Create and benchmark new CUDA kernels for inference... ...them. - Experience in Bazel build systems, and integrating third...Pipeline, Work at office, Work from home, Flexible hours
$250k - $350k
...just another scribe. We're building the AI intelligence platform... ...The Role: As a Staff ML Engineer on the Frontier AI team at Ambience... ...training runs, fine-tuning pipelines, and production deployment.... ...to ML libraries, benchmarks, or evaluation frameworks....Pipeline, Work at office, Immediate start, Remote work, Flexible hours, 3 days per week
$198k - $230k
...individual work styles. Senior MLOps Engineer (Applied AI Focus) As a... ...Annotation & Measurement Pipelines: Own the design and implementation... ...criteria that allows us to benchmark models and make informed data... ...and cost efficiencies. Build & Integrate: Collaborate with...Pipeline, Work at office, Remote work, Work from home, Worldwide, Home office, Flexible hours
- ...Research Scientist We're building the first truly private... ...your data. Our core ML challenge: how do we... ...Create synthetic data pipelines to let models squeeze... ...up numbers on a public benchmark, we're trying to make models... ...previously created evals used by Open AI, completed...Pipeline, Shift work
- ...Job Title: ML Software Engineer About Xterra Xterra is a Khosla Ventures-backed company building AI agents that reason about complex scientific... ...the harnesses that run evals at scale, and making sure our... ...Designing data systems — building pipelines and infrastructure that...Pipeline
- ...About ZETIC.ai ZETIC.ai builds an end-to-end on-device AI deployment and benchmarking platform that helps companies run... ...Description We’re hiring an ML Software Engineer (On-Device AI Model Optimizations... ...evaluation + profiling pipelines: on-device benchmarks, regression...Pipeline, Full time
$150k - $200k
...VLA models so they can be swapped, benchmarked, and upgraded as the SOTA evolves —... ...robust data collection and curation pipelines for production robot fleets. Build reliable, high‑speed robot autonomy... ...record developing and deploying ML systems from research through production...Pipeline