ML Research Engineer: Real-World Model Evaluation

Arcada

Arcada is seeking an ML Research Engineer to improve evaluation methods and understanding of AI capabilities. You will design large-scale experiments to analyze performance and contribute to public leaderboards and tools. The ideal candidate has deep expertise in evaluating models, statistical analysis, and transforming real-world issues into measurable tasks, supporting efforts across engineering, ML, and research.

Vacancy posted 2 days ago

Similar jobs that could be interesting for you
Based on the ML Research Engineer: Real-World Model Evaluation in San Francisco, CA vacancy

ML Evaluation Engineer — Real‑World AI Metrics
Arcada Labs Incorporated is seeking an ML Research Engineer in San Francisco to lead evaluations of AI models based on human preferences. You will design experiments and analysis pipelines to enhance our understanding of AI capabilities and contribute to user-facing tools...
Suggested
Arcada Labs Incorporated
San Francisco, CA
1 day ago
Senior ML Engineer - Real-World AI Evaluations
Arena Intelligence, Inc. in San Francisco is seeking a Senior Machine Learning Engineer to enhance AI model evaluation systems. You will work on data pipelines, inference APIs, and new evaluation methods. The ideal candidate possesses strong programming skills, experience...
Suggested
Arena Intelligence, Inc.
San Francisco, CA
16 hours ago
Senior Machine Learning Engineer - Model Evaluations, Public Sector
$240.45k - $300.3k
...Senior Machine Learning Engineer - Model Evaluations, Public Sector San Francisco... ...The Public Sector ML team at Scale deploys advanced... ...safely, and effectively under real-world constraints. As an ML Engineer... .... Ability to convert research insights into measurable evaluation...
Suggested
Full time
Scale AI
San Francisco, CA
1 day ago
Senior ML Engineer, Perception - Real-World Edge Deployment
...Systems, Inc in San Francisco is seeking a Senior Machine Learning Engineer to lead perception architecture for defense applications. The ideal candidate will have over 5 years of experience in real-world perception systems, strong skills in Python and C++, and a proven...
Suggested
Aurelius Systems, Inc
San Francisco, CA
3 days ago
Agentic ML Engineer: Build Real-World LLM Agents (Equity)
Acceler8 Talent is seeking an experienced ML Engineer in San Francisco to build the core agent intelligence layer for a team of Google... ...production experience in building LLM based agents that perform real world actions. This is an exciting opportunity within an early-...
Suggested
Acceler8 Talent
San Francisco, CA
2 days ago
Real-World Robotics RL Research Engineer
Pantograph is looking for research engineers to build robots that learn through exploration in the real world. Ideal candidates will have strong foundations in reinforcement learning and experience working with large GPU clusters, Kubernetes, and complex distributed systems...
Pantograph
San Francisco, CA
2 days ago
Machine Learning Engineer, Model Evaluations (Speech LLM) - San Francisco
$200k - $365k
...Plaud is building the world's most trusted AI... ...metrics that researchers and leadership can... ...Possess strong software engineering skills (especially... ...pipelines, or evaluation harnesses that can... ...scale against live model checkpoints.... ...deeply partner with ML researchers to define...
Full time
Work at office
Worldwide
Plaud
San Francisco, CA
4 days ago
ML Research Engineer - Hardware Codesign
$185k
...with software and research partners to co-... ...with AI models. In addition to... ...Hardware Codesign Engineer to operate at the... ...simulation and real measurements;... ...Proactively pull in new ML workloads,... ...drive initial evaluation of new... ...deploy them to the world through our products...
Relocation package
3 days per week
OpenAI
San Francisco, CA
1 day ago
AI/ML - Machine Learning Research Engineer, Machine Translation
$147.4k - $220.9k
...AI/ML - Machine Learning Research Engineer, Machine Translation Work Locations (3) Submit... ...and large language model (LLM) technologies. Our mission... ...learning approaches and evaluation methods. As a member of... ...MT technology to address real-world challenges. An ideal candidate...
Relocation
Apple
San Francisco, CA
16 hours ago
GenAI ML Engineer for Real-World Production Systems
$176k - $220k
...leading AI technology company is seeking a Machine Learning Research Engineer to develop critical ML systems for their GenAI platform. This high-impact... ...engineering skills, and a passion for solving real-world problems. The position offers competitive compensation...
Scale AI, Inc.
San Francisco, CA
1 day ago
Benchmarking Research Engineer: Frontier Model Evaluations
Refresh AI is seeking a Research Engineer in San Francisco to push the boundaries of benchmarking technology. You will build benchmarks that labs use for evaluating coding abilities and computer-use capability. Your role will require expertise in reinforcement learning...
Full time
Refresh AI
San Francisco, CA
2 days ago
Research Engineer, Model Evaluations - Remote-Friendly Impact
$320k
Anthropic in New York City is seeking a Research Engineer to develop evaluations for Claude’s capabilities. The ideal candidate should have strong Python... ...during training runs. The role offers a hybrid work model and competitive compensation ranging from $320,000 to $...
Remote job
Menlo Ventures
San Francisco, CA
4 days ago
ML Evaluation Engineer: Benchmark & Model Quality
A leading AI solutions company in San Francisco is seeking an ML Eval Engineer to design evaluation benchmarks and improve model performance. This role involves working with unstructured enterprise data and collaborating closely with the ML and engineering teams. You will...
Reducto
San Francisco, CA
2 days ago
ML Engineer, Public Sector: Model Evaluations & Safety
$208k - $300k
...leading AI company is seeking a Machine Learning Engineer in the Public Sector to develop automated evaluation pipelines for AI models. You will work on advanced AI systems and... ...strong programming background and experience in ML evaluation frameworks. Competitive salary...
Scale AI, Inc.
San Francisco, CA
2 days ago
Founding ML Research Engineer — Real-Time Voice AI
$225k - $400k
A pioneering AI research firm is seeking a Founding Machine Learning Research Engineer in San Francisco to develop innovative... ...AI systems for real-time voice agents. This... ...role requires a strong ML research background and... ...conducting research on LLMs, model prototyping, and...
Retell AI
San Francisco, CA
3 days ago
ML Research Engineer, ML Systems
$189.6k - $237k
...Scale's ML platform (RLXF) team... ...large language model training and inference... ...powering MLEs, researchers, data... ...automatic training and evaluation of LLM's, as... ...Strong software engineering skills,... ...systems for the world's most important... ...applications that deliver real impact. We work...
Full time
Scale AI
San Francisco, CA
1 day ago
Research Engineer - World Model
$180k
..., and focused on engineering excellence. This... ...All engineers and researchers are expected to have... ...build generative models that can... ...physical and virtual worlds. You’ll play a large... ...quantitative evaluations for world simulation... ...distillation for real-time generation....
Work at office
Local area
Relocation
xAI
San Francisco, CA
more than 2 months ago
Research Engineering Manager - Model Training
Perplexity is seeking a Research Engineering Manager to lead the... ...for developing the models that drive our products... ...experience for the world’s most sophisticated... ...iteration velocity. Design evaluations and improve the... ...rapidly iterate based on real‑world usage. Manage...
Perplexity AI Inc.
San Francisco, CA
1 day ago
ML Engineer LLM Evaluation
...must be developed with safety, privacy, and real-world responsibility in mind. Our ML team comes from a culture of academic research driven to democratize AI advancements... ...advancement. Responsibilities Own LLM evaluation processes and methods with a focus on...
Local area
Shift work
Dynamo AI
San Francisco, CA
1 day ago
Staff Software Engineer/Data Scientist, Large Model Evaluation
$238k - $302k
...the mission to be the world's most trusted driver.... ...states. The Large Model Evaluation team is at the nexus of... ...the full complexity of real-world driving. At its... ...quantitatively-minded engineers to research and propose new ways to assess the ML models deployed in the...
Full time
Remote work
Waymo
San Francisco, CA
16 hours ago
ML/AI Research Engineer Agentic AI Lab (Founding Team)
...ML/AI Research Engineer — Agentic AI Lab (Founding Team) Location... ...8VC, we're building a world-class team to tackle one... ...the design, training, evaluation, and optimization of agent-native AI models. You'll work at the... ...pipelines integrated with real-time or contextual...
Full time
Fabrion
San Francisco, CA
1 day ago
Member of Technical Staff, ML Research Engineer
...better on benchmarks, but still fail in real-world use. At Arcada Labs, we build... ...preference and judgment. That lets us evaluate models on what people actually care about, not... ...About the Role We’re looking for an ML Research Engineer to help us build better ways to evaluate...
Arcada Labs Incorporated
San Francisco, CA
1 day ago
Senior Machine Learning Research Engineer
...first audio data research company. We bring... ...AI labs bring to models. Our mission is to... ...bring AI into the real world, and we believe audio... ...of former Scale AI engineers and operators. In... .... We own the full ML lifecycle - from researching... ...training and evaluation datasets to...
Work at office
David AI
San Francisco, CA
1 day ago
Machine Learning Research Engineer, GenAI Applied ML
$189.6k - $237k
...This Role Lead applied ML engineering on Scale's Applied ML team... ...expertise, and drive research into real-world agent reliability failures... ...iteration Build data-driven evaluations and deploy rapid... ...power the world's leading models, and help enterprises and...
Full time
Scale AI
San Francisco, CA
5 days ago
Machine Learning Research Engineer, Agents - Enterprise GenAI
$250k - $350k
...Machine Learning Research Engineer, Agents - Enterprise GenAI... ...around the world. The Enterprise ML Research Lab works on... ...Building algorithms to real life enterprise datasets... ...state of the art models, developed both internally... ...a fair and thorough evaluation of all applicants....
Full time
Scale AI
San Francisco, CA
1 day ago
Model Evaluation & Data Quality Lead
...multimodal foundation models that have the... ...transform the world. Join us as we... ...member of our ML Data Team -... ...preparation and model evaluation. This role... ...consultation with our research and product... ...based on real-time information... ...: Partner with Engineering and AI Model teams...
Work at office
Worldwide
Flexible hours
Twelve Labs, Inc
San Francisco, CA
4 days ago
ML Research Engineer
...“control plane” for the physical world. We are starting with protecting... ...ultimately become the perception engine for a company’s physical footprint, enabling real-time perimeter visibility, autonomous... ...-language, and large language models to our world-class distributed perception...
Specter
San Francisco, CA
2 days ago
ML Research Engineer, Speech
...innovation through advanced hardware engineering and AI solutions. Our mission is... ...We are seeking an experienced ML Research Engineer, with a focus on speech modeling, to join our team. The person... ...interface applications to have real-world impact on patients with severe disabilities...
Flexible hours
Echo Neurotechnologies
San Francisco, CA
2 days ago
Research Engineer, World Models
$155k - $269k
...leader in Physical AI. With a world-class team, we're unlocking... ...efficient simulation. As a Research Engineer in the World Models team, you will develop... ...efficiency, and robustness for real-world applications. -... ...tooling for large-scale ML training (e.g., cloud platforms...
Full time
Work at office
Work from home
Flexible hours
Waabi
San Francisco, CA
26 days ago
Research Engineer, Model Evaluations | New | Remote-Friendly (Travel-Required) | San Francisco, CA | New York City, NY
$320k
...growing group of committed researchers, engineers, policy experts, and... ...Engineers to build the evaluations that tell us — and the world — what Claude can actually... ...leadership use to monitor model health during training,... ...running or supporting ML training infrastructure...
Remote job
Work at office
Visa sponsorship
Flexible hours
San Francisco, CA
a month ago
