ML Evaluation Engineer: Benchmark & Model Quality
Reducto
A leading AI solutions company in San Francisco is seeking an ML Eval Engineer to design evaluation benchmarks and improve model performance. This role involves working with unstructured enterprise data and collaborating closely with the ML and engineering teams. You will develop metrics, conduct evaluations, and contribute to model enhancements in a fast-paced environment. If you enjoy solving complex problems and care about precision, this is the role for you.
- A cutting-edge AI company located in San Francisco is seeking an ML Eval Engineer to enhance model evaluations and ensure quality metrics. This role involves designing benchmarks, collaborating with teams to identify model weaknesses, and developing automated processes... (Quality)
$200k - $365k
...Possess strong software engineering skills (especially... ...pipelines, or evaluation harnesses that can... ...against live model checkpoints. Can... ...deeply partner with ML researchers to define... ...) into measurable benchmarks. Are... ...transcription accuracy, audio quality, and reasoning of... (Quality, Full time, Work at office, Worldwide)
$240.45k - $300.3k
Senior Machine Learning Engineer - Model Evaluations, Public Sector The Public Sector ML team at Scale deploys advanced AI systems... .... * Design test datasets and benchmarks to measure generalization,... ..., regression testing, and quality assurance for ML systems. *... (Quality, Full time)
- ...Role We are hiring Engineers focused on AI Model Evaluation to build the systems that... ...models through automated benchmarking, dataset-driven testing,... ...realism, consistency, and quality across image, video, and... ...workflows. Collaborate with ML researchers and... (Quality)
- ...world responsibility in mind. Our ML team comes from a culture of... ...Responsibilities Own LLM evaluation processes and methods with a focus on generating benchmarks representative of real-world usage... .... Generate high quality synthetic data, curate labels,... (Quality, Local area, Shift work)
- ...Block, Inc. in San Francisco is looking for a Research Engineer to build evaluation systems for their product Firecrawl. The ideal... ...designing metrics, building pipelines, and defining benchmarks to ensure output quality. The position offers a hybrid work option, competitive... (Quality)
$208k - $300k
...leading AI company is seeking a Machine Learning Engineer in the Public Sector to develop automated evaluation pipelines for AI models. You will work on advanced AI systems and... ...strong programming background and experience in ML evaluation frameworks. Competitive salary...
- Refresh AI is seeking a Research Engineer in San Francisco to push the boundaries of benchmarking technology. You will build benchmarks that labs use for evaluating coding abilities and computer-use capability. Your role will require expertise in reinforcement learning... (Full time)
- Arcada is seeking an ML Research Engineer to improve evaluation methods and understanding of AI capabilities. You will design large-scale experiments to... .... The ideal candidate has deep expertise in evaluating models, statistical analysis, and transforming real-world issues...
$204k - $259k
...Senior Machine Learning Engineer – VLM/LLM Evaluation Waymo is an... ...demonstration, generative modeling, Bayesian inference,... ...systems and benchmarks for Waymo Foundation... ...for evaluating the quality, safety, and realism... ...experience Experience in ML engineering and... (Quality, Full time, Temporary work, Remote work)
- A fast-growing AI company seeks a Software Engineer to focus on Model Evaluation & Benchmarking. This role involves building evaluation systems for multimodal AI, ensuring reliable performance. The ideal candidate will possess strong Python programming skills, familiarity...
- ...Labs in San Francisco is seeking a dedicated member for our ML Data Team to lead video data preparation and evaluation. This role includes defining dataset needs, automating processes, and enhancing data quality through collaboration. Ideal candidates should have over 5... (Quality, Flexible hours)
- ...Francisco seeks a Machine Learning Engineer to join their Applied AI... ...candidate will fine-tune models for specific customer needs, manage evaluation infrastructure, and ensure high-quality training data processes. They must possess applied ML experience, strong programming... (Quality)
- ...consisting of a variety of LLM, speech, and vision models. Partner with ML infrastructure and training engineers to build a fast, cost-effective, accurate, and... ...opportunities to produce faster models without sacrificing quality. Use techniques like in-flight batching,... (Quality, Full time, Contract work, Flexible hours)
- ...multimodal foundation models that have the ability... ...a vital member of our ML Data Team - which leads... ...preparation and model evaluation. This role comes with... ...partnership, annotation, and quality evaluation work as... ...: Partner with Engineering and AI Model teams to... (Quality, Work at office, Worldwide, Flexible hours)
- ...candidate with a PhD in chemistry to design tasks and workflows evaluating scientific reasoning. Ideal candidates will have strong... ...cheminformatics is a plus. This role is crucial for improving data quality and model evaluation in a collaborative environment... (Quality)
- ...hiring a Senior Staff Machine Learning Engineer, focusing on driving evaluation strategies and data infrastructure... ...field, extensive experience in ML/AI systems, and strong leadership in... ...significantly impact ... operations to ensure quality and efficiency in AI applications... (Quality, Remote job)
$25 per hour
...is seeking AI Training Experts to assist in training and evaluating cutting-edge AI models. The role involves completing tasks such as analyzing and... ...and can work from home. Prolific creates a global pool for quality human data, connecting researchers with quality... (Quality, Remote job, Hourly pay, Work from home, Flexible hours)
- ...experienced data operations professional for their ML Data Team. This role focuses on video-language data preparation, model evaluation, and requires strong skills in Python and... ...datasets, and a commitment to ensuring high-quality data. The position includes benefits like... (Quality, Flexible hours)
$148.5k - $266.2k
...world. As a Machine Learning Engineering Manager on the Model Delivery team within... ...Research, you will lead production ML engineering across deployment, monitoring, evaluation, reliability, and... ...regression tests to prevent quality regressions Lead reliability... (Quality, For contractors, Remote work)
- ...technology company located in San Francisco is seeking an innovative Quality Engineer for their AI products. This role blends ops, strategy, and... ...labs, and ensure user satisfaction through effective evaluation baselines. Competitive salary and benefits offered, with a focus... (Quality)
- ...Machine Learning Engineer Location: Onsite... ...building foundation AI models for physics that... ...is hiring an ML Engineer to help ship... .../fine-tuning, benchmarking, and delivering results... ...training and evaluation. Run training and... ...and maintain high-quality, reproducible work... (Quality, Work at office, Flexible hours, 1 day per week)
- ...Founding Applied ML Engineer Title of Role: Founding Applied ML... ...focuses on providing high-quality audio datasets and associated... ...Apply state-of-the-art models in automatic speech recognition... ...and delivery workflows. Evaluate and benchmark model performance,... (Quality, Work at office)
- ...company is seeking a Machine Learning Engineering Manager in San Francisco. The successful... ...developing and maintaining production ML systems, ensuring quality and operational excellence. Responsibilities include improving ML models, managing production releases, and advancing... (Quality, Remote work)
- ...California is seeking a highly skilled professional for foundation model development. The ideal candidate will focus on gathering and generating high-quality text data through advanced data engineering techniques. Candidates should have strong expertise in machine learning... (Quality)
$250k - $350k
...started. The Role: As a Staff ML Engineer on the Frontier AI team at Ambience, you'll own the hardest model quality problems across our clinical AI products... ...-source contributions to ML libraries, benchmarks, or evaluation frameworks. Why Here: Our products... (Quality, Work at office, Immediate start, Remote work, Flexible hours, 3 days per week)
- ...Machine Learning Engineer opportunities posted... ...end-to-end ML pipelines encompassing... ...practices such as model versioning, experiment... .... Ensure data quality, observability, and... ...Learning Engineer, Core Evaluations The... ...storage. Write tests, benchmarks, and diagnostics to... (Quality, Flexible hours)
- ...you can train your models in the cloud, deploy... ...looking for a Senior ML Performance Engineer to architect and lead... ...our product quality and our customers’ success... ...testing platform for evaluating LLM inference workloads... ...and implement the benchmarking methodology, metrics... (Quality)
$250k - $300k
...Machine Learning Engineer - Speech Model Training $250,000 - $300,000 San Francisco, CA... ...alignment techniques to improve conversational quality Debug the hard problems in... ...Genuine comfort traversing the entire ML stack from signal processing to production... (Quality, Permanent employment, Full time, Work at office, Immediate start, Worldwide)
- ...bring cutting‑edge models into production.... ...build the platform engineers turn to ship AI products... ...discover, evaluate, and select the right... ...production‑quality execution. You’ll... ...Partner with product, ML, and cross‑functional... ...model evaluation, benchmarking, or comparison frameworks... (Quality, Flexible hours)

