ML Evaluation Engineer — Real‑World AI Metrics

Arcada Labs Incorporated

Arcada Labs Incorporated is seeking an ML Research Engineer in San Francisco to lead evaluations of AI models based on human preferences. You will design experiments and analysis pipelines to enhance our understanding of AI capabilities and contribute to user-facing tools and leaderboards. Ideal candidates should have experience with modern AI systems and model evaluation methodologies, along with strong statistical judgment and a passion for advancing model capabilities.

Vacancy posted 12 hours ago

Similar jobs that could be interesting for you
Based on the ML Evaluation Engineer — Real‑World AI Metrics in San Francisco, CA vacancy

ML Infra Engineer — Scale Real‑World AI (SF On-site)
$250k - $350k
Most AI roles build on top of models. This one builds what makes them actually work. We’re hiring ML Infrastructure Engineers to tackle a hard, real-world problem, understanding what’s happening on live job sites using wearable devices, large-scale video, and AI. This isn...
Trades Workforce Solutions
San Francisco, CA
1 day ago
Senior ML Engineer, Perception - Real-World Edge Deployment
...Systems, Inc in San Francisco is seeking a Senior Machine Learning Engineer to lead perception architecture for defense applications. The ideal candidate will have over 5 years of experience in real-world perception systems, strong skills in Python and C++, and a proven...
Aurelius Systems, Inc
San Francisco, CA
2 days ago
ML Engineer — LLM Evaluation
At Dynamo AI, we believe that LLMs must be developed with safety, privacy, and real-world responsibility in mind. Our ML team comes from a culture of academic research driven to democratize... .... Responsibilities Own LLM evaluation processes and methods with a focus on...
Local area
Shift work
Capitolis
San Francisco, CA
1 day ago
Agentic ML Engineer: Build Real-World LLM Agents (Equity)
Acceler8 Talent is seeking an experienced ML Engineer in San Francisco to build the core agent intelligence layer for a team of Google... ...production experience in building LLM based agents that perform real world actions. This is an exciting opportunity within an early-...
Acceler8 Talent
San Francisco, CA
1 day ago
Applied ML Engineer: Build Real-World AI in Production
...Ventures is looking for an Applied Research Engineer to design, train, and deploy AI models that enhance business processes.... ...proprietary data. Your experience in training ML models and ability to handle messy data will drive real business outcomes. You'll be responsible...
Alumni Ventures
San Francisco, CA
2 days ago
Founding ML Engineer
$150k - $300k
...An early-stage AI data company that went... ...You will own the ML systems that turn... ...party APIs. Build, evaluate, and iterate on retrieval... ...learning, metric learning, representation... ...ML research and engineering cycle, from... ...hundreds of millions of real‑world records. Join a...
Open Select
San Francisco, CA
4 days ago
Founding ML Engineer
...to the internet for AI agents. Our APIs... ...boundaries of what our ML systems can do. We'... ...a Founding ML Engineer to own the research... ...know how to build and evaluate retrieval systems,... ...contrastive learning, metric learning, and... ...systems over messy real-world data Background in...
Crustdata (YC F24)
San Francisco, CA
3 days ago
ML Engineer
...Machine Learning Engineer Location:... ...OpenAI for Physics. AI startup based in... ...is hiring an ML Engineer to help... ...model training and evaluation. Run training... ...clearly (metrics, dashboards, short... ...Passionate about solving real customer... ...collaboration Access to world-class investors...
Work at office
Flexible hours
1 day per week
UniversalAGI
San Francisco, CA
3 days ago
Senior ML Engineer, Manipulation
$200k - $300k
...machines in the physical world, starting with... ...base. Our AI-powered robots automate... .... As a Senior ML Engineer, Manipulation, you... ...physical robots in real environments. We... ...finger) Implement and evaluate modern policy... ...Define evaluation metrics and regression benchmarks...
Flexible hours
Chef Robotics, Inc.
San Francisco, CA
3 days ago
Senior ML Engineer, Autonomous Driving Evaluation
..., Proficiency in Python and standard ML frameworks (e.g., JAX, TensorFlow) ,... ...Desirable) Experience designing and using metrics for evaluating complex AI systems , (Desirable) Track record of... ...looking for researchers and software engineers who are passionate about developing...
Waymo
San Francisco, CA
1 day ago
Senior ML Infra Engineer Real-Time, Low-Latency
A leading AI evaluation platform in San Francisco is looking for a Senior Software Engineer specializing in ML infrastructure. The successful candidate will design and develop robust real-time data and API systems, enabling insights for researchers and developers. Ideal...
LMArena
San Francisco, CA
2 hours ago
Founding Senior ML Engineer for Real-Time Voice AI
$200k - $280k
A leading voice AI startup is seeking a Founding Senior Machine Learning Engineer to fine-tune and deploy human-like voice... ..., handling millions of real-time calls. This role offers... ...Candidates should have real-world experience in deploying ML models and work well in a fast...
Retell AI
San Francisco, CA
2 hours ago
ML Evaluation Engineer: Benchmark & Model Quality
A leading AI solutions company in San Francisco is seeking an ML Eval Engineer to design evaluation benchmarks and improve model performance. This role involves working with unstructured... ...ML and engineering teams. You will develop metrics, conduct evaluations, and contribute to...
Reducto
San Francisco, CA
1 day ago
Machine Learning Engineer, Model Evaluations (Speech LLM) - San Francisco
$180k - $270k
...Plaud is building the world’s most trusted AI work companion for... ...defensible, and automated metrics that researchers and... ...strong software engineering skills (especially in... ..., data pipelines, or evaluation harnesses that can run... ...deeply partner with ML researchers to define...
Full time
Work at office
Worldwide
Plaud
San Francisco, CA
12 hours ago
Applied ML Engineer
...We’re looking for an ML Engineer with 4-8 years of... ...and owning production AI and ML systems used by real people. You’re comfortable... ...with clear quality metrics and business impact.... ...judgment around evaluation and know how to... ...fragmented markets in the world: hiring. We partner...
Paraform
San Francisco, CA
1 day ago
ML Evaluation Engineer: Benchmark & Model Quality
A cutting-edge AI company located in San Francisco is seeking an ML Eval Engineer to enhance model evaluations and ensure quality metrics. This role involves designing benchmarks, collaborating with teams to identify model weaknesses, and developing automated processes...
Reducto, Inc.
San Francisco, CA
2 days ago
Applied ML Engineer: Real-World AI Solutions
Adaption Labs is seeking an Applied ML Engineer to bridge applied research and product development in San Francisco. You will work closely with customers to identify issues, implement ML solutions, and drive results across diverse environments. Ideal candidates have strong...
Flexible hours
Adaption Labs
San Francisco, CA
12 hours ago
ML Engineering Lead
About Saris AI We're a San Francisco... ...We’ve shipped real agents that handle... .... Our core engineering team is looking for a hands-on ML Engineering Lead... ...systems powering real-world workflows Define and oversee evaluation frameworks,... ...and performance metrics to continuously...
Saris AI
San Francisco, CA
2 days ago
AI/ML Engineer
..., we’re building the world’s first AI teacher: one that listens... ...re looking for an AI/ML Engineer to join our product... ...while delivering real-world products. We use... ...product, designing the evaluations that prove they work,... ..., to concrete metrics and research plans, back...
Worldwide
Shift work
Ello
San Francisco, CA
3 days ago
ML Software Engineer
About ZETIC.ai ZETIC.ai builds an end-to... ...efficiently on real consumer devices—without... ...We’re hiring an ML Software Engineer (On-Device AI... ...Build and maintain evaluation + profiling pipelines... ...models for real-world deployment. Strong... ...to-end with clear metrics and deliverables...
Full time
CAPSA
San Francisco, CA
1 day ago
AI/ML Engineer
$200k - $300k
...States to help them hire. AI/ML Engineer Location: San... ...$55 million Series A from world-class investors and operators... ...systems that create measurable real-world impact. This is an... ...production applications. Define evaluation frameworks, metrics, and testing methodologies...
H1b
Work at office
Remote work
Relocation package
3 days per week
Recruiting from Scratch
San Francisco, CA
2 days ago
Machine Learning Engineer
$250k - $300k
...developing the most capable AI systems for... ...As a Machine Learning Engineer at Ambience, you will... ...opportunities. Scale Model Evaluation: Collaborate with... ...pipelines, track performance metrics, and integrate real‑world feedback. Explore... .... Who You Are Deep ML & NLP Fundamentals...
Work at office
Dormont Manufacturing Company
San Francisco, CA
57 minutes ago
Founding Machine Learning Engineer
...About Poesis Poesis is the AI-native investment... ...research with immediate real-world validation. Your work will... ...re hiring our Founding ML Engineer, the first full-time... ...Implement backtesting and evaluation frameworks with clear performance metrics. Deliver regular,...
Full time
Immediate start
Relocation
Visa sponsorship
Relocation package
Poesis LLC
San Francisco, CA
2 days ago
ML Engineer
...Machine Learning Engineer – Perception Models At Mach9, ML Engineers build the perception... ...at the core of our AI‑enabled CAD system.... ...imagery to serve real surveyors and... ...Design, train, and evaluate computer vision and... ...evaluation methodology and metrics that reflect real surveying...
Mach9
San Francisco, CA
3 days ago
Senior Machine Learning Engineer
$200k - $400k
...platform to train AI video models. Troveo offers the world’s largest... ...innovative strategic engineer to help us scale... ...across the full ML lifecycle, from... ...datasets to deploying, evaluating, and training... ...targets, and real‑world outcomes.... ...frameworks with metrics like NDCG, mAP,...
Work experience placement
Troveo AI
San Francisco, CA
2 days ago
Senior Machine Learning Engineer
$200k - $280k
...Machine Learning Engineer Join to apply... ...role at Retell AI. Base pay range... ...startups in the world, you’ll love it... ...role for ML engineers who want... ...and perform under real‑world constraints... ...and audio models, evaluate them with... ...define rigorous metrics, and measure model...
H1b
Work at office
Retell AI
San Francisco, CA
2 days ago
ML Engineer
...new Machine Learning Engineer opportunities posted on AI Chopping Block... ...optimize end-to-end ML pipelines encompassing... ...deploy models into real-time products with a... ...Learning Engineer, Core Evaluations The responsibilities... ...requirements into measurable metrics, and designing and...
Flexible hours
AI Chopping Block, Inc.
San Francisco, CA
3 days ago
ML Engineer
$250k - $400k
...Define how large-scale AI systems for scientific... ...models to be trained, evaluated, and deployed reliably... ...isolation. It's building the engine that research runs on.... ...on, and deployed in real-world environments. The company... ...building and scaling ML systems in production...
Remote work
techire ai
San Francisco, CA
1 day ago
ML Engineer - Fraud Risk/AI Data Science
...Job Details Title: ML Engineer - Fraud Risk/AI Data Science Job Type: Contract... ...to predict fraud risk in a real‑time environment, and... ...features. Design, build, evaluate, and defend machine learning... ...model results. Implement metrics like AUC, KS, and Gini to...
Contract work
For contractors
Work at office
Remote work
DeWinter Group
San Francisco, CA
1 hour ago
Member of Technical Staff (ML Engineer, Recommendations & User Modeling)
...seeking experienced ML engineers to design, build,... ...Perplexity builds AI for those who... ...user and business metrics. Build user modeling... ...Build the data and evaluation foundations that let... ..., feature stores, real-time serving).... ...drive change in the world. Driving change is...
Neura Market
San Francisco, CA
2 days ago
