
ML Evals Engineer Build Benchmarking Pipelines

AI Chopping Block, Inc.

AI Chopping Block, Inc. in San Francisco is looking for a Research Engineer to build evaluation systems for its product Firecrawl. The ideal candidate has more than 3 years of experience in machine learning engineering. Responsibilities include designing metrics, building pipelines, and defining benchmarks to ensure output quality. The position offers a hybrid work option, a competitive salary, and equity.

Vacancy posted 3 days ago
Similar jobs that could be interesting for you
Based on the ML Evals Engineer Build Benchmarking Pipelines in San Francisco, CA vacancy
  •  ...About the Role ML Ops Engineer — Agentic AI Lab (Founding Team...  ...) Backed by 8VC, we're building a world-class team to tackle...  ...LLMs, agent-native pipelines, retrieval-augmented...  ...manage evaluation and benchmarking frameworks (e.g. OpenLLM-Evals, RAGAS, LangSmith) Integrate... 
    Pipeline
    Full time

    Fabrion

    San Francisco, CA
    2 days ago
  •  ...About the Role As a Research Engineer at Mercor, you’ll work at...  ...applied AI research. You’ll own benchmarking pipelines, evaluation systems, and...  .... You’ll design and run evals, build rubrics and scorers, and turn...  ...hands‑on experience with ML models and evaluation code.... 
    Pipeline
    Work at office

    Mercor

    San Francisco, CA
    4 days ago
  • $151k - $257k

     ...Zoox is seeking a Machine Learning Engineer for 3D Simulation in San Francisco. This role focuses on developing advanced...  ...Responsibilities include collaborating with researchers and building scalable cloud pipelines. Compensation ranges from $151,000 to $257,000 annually,... 
    Pipeline

    jobs.frontdoordefense.com - Jobboard

    San Francisco, CA
    3 days ago
  •  ...Jack & Jill is seeking a Founding AI/ML Engineer in San Francisco. The role involves building Generative Engine Optimization systems for AI search. Ideal candidates have strong NLP fundamentals and experience in production ML systems. You'll collaborate with world-class... 
    Pipeline

    Jack & Jill

    San Francisco, CA
    4 days ago
  •  ...AI We're a small, senior team building the intelligent OS for the...  ...The Role We're hiring a Senior ML Engineer to help build the AI systems that...  ...across the full ML stack: data pipelines, model training, retrieval, ranking, evals, and deployment. You will ship features... 
    Pipeline
    Work at office
    Relocation
    Relocation package
    Flexible hours

    Highlight AI

    San Francisco, CA
    10 hours ago
  •  ...ML Systems Engineer – Robotics & AI We are building the full-stack foundation for the next generation of humanoid robots...  ...sharded training, tensor/pipeline parallelism, gradient accumulation...  ...for real-world robotics, not toy benchmarks. Tight collaboration between systems... 
    Pipeline

    Maxwell Bond

    San Francisco, CA
    4 days ago
  • $250k - $350k

     ...Most AI roles build on top of models. This one builds what...  ...actually work. We’re hiring ML Infrastructure Engineers to tackle a hard, real-world...  ..., and AI. This isn’t clean benchmark data. It’s messy, continuous...  ...: High-throughput video pipelines handling millions of hours... 
    Pipeline

    Trades Workforce Solutions

    San Francisco, CA
    3 days ago
  •  ...A tech startup in San Francisco is seeking a Machine Learning Engineer to shape technical direction, automate ML life cycles, and contribute to the architectural roadmap. The ideal candidate has 3 to 5 years in full-stack Deep Learning and Computer Vision, prior startup... 
    Pipeline

    OpenReq

    San Francisco, CA
    4 days ago
  • $150k - $200k

     ...speed of thought. Mithrl is building the world’s first...  ...Scientist. It is a discovery engine that transforms messy biological...  ...THE ROLE We are hiring an ML Engineer, Analysis and Simulation...  ...data Validate results, benchmark pipelines, and ensure scientific... 
    Pipeline
    Work at office

    Mithrl

    San Francisco, CA
    19 days ago
  •  ...monitoring platform for AI agents. Engineering teams at some of the fastest...  ...logs and trying to debug flaky evals that just aren't matching...  ...and more. Your Focus Build out a world-class product -...  ...Architect, implement, and scale ML pipelines Quick iteration without... 
    Pipeline
    Temporary work

    Raindrop

    San Francisco, CA
    1 day ago
  •  ...Key Engineer For Ai-native Core Tools And Models We're looking...  ...engineer with significant AI/ML + LLM experience to build AI-native core tools and models...  ...agents Experience building ML pipelines Experience fine-tuning and benchmarking LLMs/foundation models... 
    Pipeline

    Silimate

    San Francisco, CA
    1 day ago
  •  ...Mach9 ML Engineer Role At Mach9, ML Engineers build the perception models at the core of our AI-enabled CAD system...  ...power our automated extraction pipeline — image and 3D detection and localization...  ..., not just publishing or benchmarking. Working knowledge of geometric... 
    Pipeline

    Mach9

    San Francisco, CA
    1 day ago
  • $200k

     ...Founding ML Engineer San Francisco, on-site, full-time - $200,000 - $500,000 per year...  ...welcome! More concretely, you will… You'll build our ML pipeline for peptide drug discovery from the...  ...) on our proprietary data and benchmark against classical approaches Design rigorous... 
    Pipeline
    Full time
    Night shift
    Day shift
    Afternoon shift

    Stealth Deep Tech

    San Francisco, CA
    3 days ago
  •  ...out of the box. At Chef, we're building that model: the Food Foundation Model. As a Senior ML Engineer, Foundation Models, you will...  .... Your models won't just benchmark well; they'll serve millions...  ..., fine-tuning, and alignment pipelines that improve the model's ability... 
    Pipeline
    Flexible hours

    Chef Robotics

    San Francisco, CA
    2 days ago
  • $500 per month

     ...backed defense tech startup building autonomous, edge deployed...  ...We're a small team of engineers, former US military operators...  ...had to work outside a benchmark. You'll partner directly...  ...classification, and sensor fusion ML model development, training pipelines, and on-device deployment... 
    Pipeline
    Permanent employment
    Work at office
    Monday to Friday
    Flexible hours
    Night shift
    Weekend work

    Aurelius Systems

    San Francisco, CA
    1 day ago
  •  ...the Role Chef Robotics is building autonomous robots that work...  ...configurations. As a Senior ML Engineer, Manipulation, you will own...  ...to build data collection pipelines using teleoperation, kinesthetic...  ...metrics and regression benchmarks that accurately predict real... 
    Pipeline
    Flexible hours

    Chef Robotics

    San Francisco, CA
    2 days ago
  •  ...Position: Senior ML Performance Engineer Location: SF Bay Area (US) or Toronto...  ...infrastructure company is building a high-performance,...  ...across GPU clusters Define benchmarking methodologies, metrics, and...  ...improvements Build automated pipelines for continuous performance... 
    Pipeline
    Full time

    Amadeus Search

    San Francisco, CA
    4 days ago
  • $180k - $240k

    We're looking for an ML engineer to push the boundaries of what our...  ...agents can do. You'll design and build the AI systems that...  ...signals Build and refine LLM pipelines: prompt engineering, retrieval...  ...industrial problems, not just benchmarks Optimized,... 
    Pipeline

    Optimized, Inc.

    San Francisco, CA
    5 days ago
  •  ...Founding Applied ML Engineer Title of Role: Founding Applied ML Engineer...  ...What You Will Do Design, build, and iterate on machine learning systems and pipelines for audio dataset creation and...  ...delivery workflows. Evaluate and benchmark model performance, establish... 
    Pipeline
    Work at office

    Recruiting from Scratch

    San Francisco, CA
    6 days ago
  •  ...out 1962 new Machine Learning Engineer opportunities posted on AI Chopping Block Design, build, and maintain scalable machine...  ...Develop and optimize end-to-end ML pipelines encompassing data collection,...  ..., and storage. Write tests, benchmarks, and diagnostics to detect significant... 
    Pipeline
    Flexible hours

    AI Chopping Block, Inc.

    San Francisco, CA
    2 days ago
  •  ...an LLM-first search engine and our specialized data...  ...one, and popular benchmarks do not effectively cover...  ...this role, you will build specialized evals to improve answer quality...  ...automated evaluation pipelines to assess answer...  ...methods to real-world ML problems Experience defining... 
    Pipeline

    Gravity Engineering Services Pvt Ltd.

    San Francisco, CA
    3 days ago
  •  ...Onsite Machine Learning Engineer Location: Onsite in...  ...Are UniversalAGI is building OpenAI for Physics. AI...  ...UniversalAGI is hiring an ML Engineer to help ship...  ...training/fine-tuning, benchmarking, and delivering...  ...preprocessing and data generation pipelines to support model... 
    Pipeline
    Work at office
    Flexible hours
    1 day per week

    UniversalAGI

    San Francisco, CA
    3 days ago
  • $100k - $200k

     ...scales voice and chat AI agents ML‑Infrastructure Engineer Salary $100K - $200K Equity...  ...to the rest of our pipeline. Making our pipelines go fast...  ...experiment with all of them. You'll benchmark the latest models across...  ...make pragmatic calls on build‑vs‑buy, self‑host‑vs‑... 
    Pipeline
    Full time
    Live in
    Work at office

    Voiceflow

    San Francisco, CA
    4 days ago
  • $141k - $249k

     ...with autonomy and algorithm engineers to scale safe self-driving systems...  ...Expand the model deployment pipeline to new GPUs and embedded...  ...on the truck. - Create and benchmark new CUDA kernels for inference...  ...them. - Experience in Bazel build systems, and integrating third... 
    Pipeline
    Work at office
    Work from home
    Flexible hours

    Waabi

    San Francisco, CA
    2 days ago
  • $250k - $350k

     ...just another scribe. We're building the AI intelligence platform...  ...The Role: As a Staff ML Engineer on the Frontier AI team at Ambience...  ...training runs, fine-tuning pipelines, and production deployment....  ...to ML libraries, benchmarks, or evaluation frameworks.... 
    Pipeline
    Work at office
    Immediate start
    Remote work
    Flexible hours
    3 days per week

    Ambience Healthcare

    San Francisco, CA
    10 hours ago
  • $198k - $230k

     ...individual work styles. Senior MLOps Engineer (Applied AI Focus) As a...  ...Annotation & Measurement Pipelines: Own the design and implementation...  ...criteria that allows us to benchmark models and make informed data...  ...and cost efficiencies. Build & Integrate: Collaborate with... 
    Pipeline
    Work at office
    Remote work
    Work from home
    Worldwide
    Home office
    Flexible hours

    CreatorIQ

    San Francisco, CA
    5 days ago
  •  ...Research Scientist We're building the first truly private...  ...your data. Our core ML challenge: how do we...  ...Create synthetic data pipelines to let models squeeze...  ...up numbers on a public benchmark, we're trying to make models...  ...previously created evals used by OpenAI, completed... 
    Pipeline
    Shift work

    Workshop Labs

    San Francisco, CA
    2 days ago
  •  ...Job Title: ML Software Engineer About Xterra Xterra is a Khosla Ventures-backed company building AI agents that reason about complex scientific...  ...the harnesses that run evals at scale, and making sure our...  ...Designing data systems — building pipelines and infrastructure that... 
    Pipeline

    Xterraai

    San Francisco, CA
    3 days ago
  •  ...About ZETIC.ai ZETIC.ai builds an end-to-end on-device AI deployment and benchmarking platform that helps companies run...  ...Description We’re hiring an ML Software Engineer (On-Device AI Model Optimizations...  ...evaluation + profiling pipelines: on-device benchmarks, regression... 
    Pipeline
    Full time

    CAPSA

    San Francisco, CA
    4 hours ago
  • $150k - $200k

     ...VLA models so they can be swapped, benchmarked, and upgraded as the SOTA evolves —...  ...robust data collection and curation pipelines for production robot fleets. Build reliable, high‑speed robot autonomy...  ...record developing and deploying ML systems from research through production... 
    Pipeline

    Deft AI, Inc.

    San Francisco, CA
    3 days ago
