ML Evaluation Engineer — Real‑World AI Metrics

Arcada Labs Incorporated

Arcada Labs Incorporated is seeking an ML Research Engineer in San Francisco to lead evaluations of AI models based on human preferences. You will design experiments and analysis pipelines to enhance our understanding of AI capabilities and contribute to user-facing tools and leaderboards. Ideal candidates should have experience with modern AI systems and model evaluation methodologies, along with strong statistical judgment and a passion for advancing model capabilities.

Vacancy posted 12 hours ago
Similar jobs that could be interesting for you
Based on the ML Evaluation Engineer — Real‑World AI Metrics vacancy in San Francisco, CA
  • $250k - $350k

    Most AI roles build on top of models. This one builds what makes them actually work. We’re hiring ML Infrastructure Engineers to tackle a hard, real-world problem, understanding what’s happening on live job sites using wearable devices, large-scale video, and AI. This isn... 
    Suggested

    Trades Workforce Solutions

    San Francisco, CA
    1 day ago
  •  ...Systems, Inc in San Francisco is seeking a Senior Machine Learning Engineer to lead perception architecture for defense applications. The ideal candidate will have over 5 years of experience in real-world perception systems, strong skills in Python and C++, and a proven... 
    Suggested

    Aurelius Systems, Inc

    San Francisco, CA
    2 days ago
  • At Dynamo AI, we believe that LLMs must be developed with safety, privacy, and real-world responsibility in mind. Our ML team comes from a culture of academic research driven to democratize...  .... Responsibilities Own LLM evaluation processes and methods with a focus on... 
    Suggested
    Local area
    Shift work

    Capitolis

    San Francisco, CA
    1 day ago
  • Acceler8 Talent is seeking an experienced ML Engineer in San Francisco to build the core agent intelligence layer for a team of Google...  ...production experience in building LLM based agents that perform real world actions. This is an exciting opportunity within an early-... 
    Suggested

    Acceler8 Talent

    San Francisco, CA
    1 day ago
  •  ...Ventures is looking for an Applied Research Engineer to design, train, and deploy AI models that enhance business processes....  ...proprietary data. Your experience in training ML models and ability to handle messy data will drive real business outcomes. You'll be responsible... 
    Suggested

    Alumni Ventures

    San Francisco, CA
    2 days ago
  • $150k - $300k

     ...An early-stage AI data company that went...  ...You will own the ML systems that turn...  ...party APIs. Build, evaluate, and iterate on retrieval...  ...learning, metric learning, representation...  ...ML research and engineering cycle, from...  ...hundreds of millions of real‑world records. Join a... 

    Open Select

    San Francisco, CA
    4 days ago
  •  ...to the internet for AI agents. Our APIs...  ...boundaries of what our ML systems can do. We'...  ...a Founding ML Engineer to own the research...  ...know how to build and evaluate retrieval systems,...  ...contrastive learning, metric learning, and...  ...systems over messy real-world data Background in... 

    Crustdata (YC F24)

    San Francisco, CA
    3 days ago
  •  ...Machine Learning Engineer Location:...  ...OpenAI for Physics. AI startup based in...  ...is hiring an ML Engineer to help...  ...model training and evaluation. Run training...  ...clearly (metrics, dashboards, short...  ...Passionate about solving real customer...  ...collaboration Access to world-class investors... 
    Work at office
    Flexible hours
    1 day per week

    UniversalAGI

    San Francisco, CA
    3 days ago
  • $200k - $300k

     ...machines in the physical world, starting with...  ...base. Our AI-powered robots automate...  .... As a Senior ML Engineer, Manipulation, you...  ...physical robots in real environments. We...  ...finger) Implement and evaluate modern policy...  ...Define evaluation metrics and regression benchmarks... 
    Flexible hours

    Chef Robotics, Inc.

    San Francisco, CA
    3 days ago
  •  ..., Proficiency in Python and standard ML frameworks (e.g., JAX, TensorFlow) ,...  ...Desirable) Experience designing and using metrics for evaluating complex AI systems , (Desirable) Track record of...  ...looking for researchers and software engineers who are passionate about developing... 

    Waymo

    San Francisco, CA
    1 day ago
  • A leading AI evaluation platform in San Francisco is looking for a Senior Software Engineer specializing in ML infrastructure. The successful candidate will design and develop robust real-time data and API systems, enabling insights for researchers and developers. Ideal... 

    LMArena

    San Francisco, CA
    2 hours ago
  • $200k - $280k

    A leading voice AI startup is seeking a Founding Senior Machine Learning Engineer to fine-tune and deploy human-like voice...  ..., handling millions of real-time calls. This role offers...  ...Candidates should have real-world experience in deploying ML models and work well in a fast... 

    Retell AI

    San Francisco, CA
    2 hours ago
  • A leading AI solutions company in San Francisco is seeking an ML Eval Engineer to design evaluation benchmarks and improve model performance. This role involves working with unstructured...  ...ML and engineering teams. You will develop metrics, conduct evaluations, and contribute to... 

    Reducto

    San Francisco, CA
    1 day ago
  • $180k - $270k

     ...Plaud is building the world’s most trusted AI work companion for...  ...defensible, and automated metrics that researchers and...  ...strong software engineering skills (especially in...  ..., data pipelines, or evaluation harnesses that can run...  ...deeply partner with ML researchers to define... 
    Full time
    Work at office
    Worldwide

    Plaud

    San Francisco, CA
    12 hours ago
  •  ...We’re looking for an ML Engineer with 4-8 years of...  ...and owning production AI and ML systems used by real people. You’re comfortable...  ...with clear quality metrics and business impact....  ...judgment around evaluation and know how to...  ...fragmented markets in the world: hiring. We partner... 

    Paraform

    San Francisco, CA
    1 day ago
  • A cutting-edge AI company located in San Francisco is seeking an ML Eval Engineer to enhance model evaluations and ensure quality metrics. This role involves designing benchmarks, collaborating with teams to identify model weaknesses, and developing automated processes... 

    Reducto, Inc.

    San Francisco, CA
    2 days ago
  • Adaption Labs is seeking an Applied ML Engineer to bridge applied research and product development in San Francisco. You will work closely with customers to identify issues, implement ML solutions, and drive results across diverse environments. Ideal candidates have strong... 
    Flexible hours

    Adaption Labs

    San Francisco, CA
    12 hours ago
  • About Saris AI We're a San Francisco...  ...We’ve shipped real agents that handle...  .... Our core engineering team is looking for a hands-on ML Engineering Lead...  ...systems powering real-world workflows Define and oversee evaluation frameworks,...  ...and performance metrics to continuously... 

    Saris AI

    San Francisco, CA
    2 days ago
  •  ..., we’re building the world’s first AI teacher: one that listens...  ...re looking for an AI/ML Engineer to join our product...  ...while delivering real-world products. We use...  ...product, designing the evaluations that prove they work,...  ..., to concrete metrics and research plans, back... 
    Worldwide
    Shift work

    Ello

    San Francisco, CA
    3 days ago
  • About ZETIC.ai ZETIC.ai builds an end-to...  ...efficiently on real consumer devices—without...  ...We’re hiring an ML Software Engineer (On-Device AI...  ...Build and maintain evaluation + profiling pipelines...  ...models for real-world deployment. Strong...  ...to-end with clear metrics and deliverables... 
    Full time

    CAPSA

    San Francisco, CA
    1 day ago
  • $200k - $300k

     ...States to help them hire. AI/ML Engineer Location: San...  ...$55 million Series A from world-class investors and operators...  ...systems that create measurable real-world impact. This is an...  ...production applications. Define evaluation frameworks, metrics, and testing methodologies... 
    H1b
    Work at office
    Remote work
    Relocation package
    3 days per week

    Recruiting from Scratch

    San Francisco, CA
    2 days ago
  • $250k - $300k

     ...developing the most capable AI systems for...  ...As a Machine Learning Engineer at Ambience, you will...  ...opportunities. Scale Model Evaluation: Collaborate with...  ...pipelines, track performance metrics, and integrate real‑world feedback. Explore...  .... Who You Are Deep ML & NLP Fundamentals... 
    Work at office

    Dormont Manufacturing Company

    San Francisco, CA
    57 minutes ago
  •  ...About Poesis Poesis is the AI-native investment...  ...research with immediate real-world validation. Your work will...  ...re hiring our Founding ML Engineer, the first full-time...  ...Implement backtesting and evaluation frameworks with clear performance metrics. Deliver regular,... 
    Full time
    Immediate start
    Relocation
    Visa sponsorship
    Relocation package

    Poesis LLC

    San Francisco, CA
    2 days ago
  •  ...Machine Learning Engineer – Perception Models At Mach9, ML Engineers build the perception...  ...at the core of our AI‑enabled CAD system....  ...imagery to serve real surveyors and...  ...Design, train, and evaluate computer vision and...  ...evaluation methodology and metrics that reflect real surveying... 

    Mach9

    San Francisco, CA
    3 days ago
  • $200k - $400k

     ...platform to train AI video models. Troveo offers the world’s largest...  ...innovative strategic engineer to help us scale...  ...across the full ML lifecycle, from...  ...datasets to deploying, evaluating, and training...  ...targets, and real‑world outcomes....  ...frameworks with metrics like NDCG, mAP,... 
    Work experience placement

    Troveo AI

    San Francisco, CA
    2 days ago
  • $200k - $280k

     ...Machine Learning Engineer Join to apply...  ...role at Retell AI. Base pay range...  ...startups in the world, you’ll love it...  ...role for ML engineers who want...  ...and perform under real‑world constraints...  ...and audio models, evaluate them with...  ...define rigorous metrics, and measure model... 
    H1b
    Work at office

    Retell AI

    San Francisco, CA
    2 days ago
  •  ...new Machine Learning Engineer opportunities posted on AI Chopping Block...  ...optimize end-to-end ML pipelines encompassing...  ...deploy models into real-time products with a...  ...Learning Engineer, Core Evaluations The responsibilities...  ...requirements into measurable metrics, and designing and... 
    Flexible hours

    AI Chopping Block, Inc.

    San Francisco, CA
    3 days ago
  • $250k - $400k

     ...Define how large-scale AI systems for scientific...  ...models to be trained, evaluated, and deployed reliably...  ...isolation. It's building the engine that research runs on....  ...on, and deployed in real-world environments. The company...  ...building and scaling ML systems in production... 
    Remote work

    techire ai

    San Francisco, CA
    1 day ago
  •  ...Job Details Title: ML Engineer - Fraud Risk/AI Data Science Job Type: Contract...  ...to predict fraud risk in a real‑time environment, and...  ...features. Design, build, evaluate, and defend machine learning...  ...model results. Implement metrics like AUC, KS, and Gini to... 
    Contract work
    For contractors
    Work at office
    Remote work

    DeWinter Group

    San Francisco, CA
    1 hour ago
  •  ...seeking experienced ML engineers to design, build,...  ...Perplexity builds AI for those who...  ...user and business metrics. Build user modeling...  ...Build the data and evaluation foundations that let...  ..., feature stores, real-time serving)....  ...drive change in the world. Driving change is... 

    Neura Market

    San Francisco, CA
    2 days ago
