Software Engineer, AI Data & Evaluation

Mercor Alabaster

About Mercor

Mercor's mission is to organize human intelligence to power the AI economy. We partner with leading AI labs and enterprises to provide the human intelligence essential to AI development. Our vast talent network trains frontier AI models in the same way teachers teach students: by sharing knowledge, experience, and context that can't be captured in code alone. Today, more than 30,000 experts in our network collectively earn over $2 million a day.

Mercor is creating a new category of work where expertise powers AI advancement. Achieving this requires an ambitious, fast-paced, and deeply committed team. You'll work alongside researchers, operators, and AI companies at the forefront of shaping the systems that are redefining society. Mercor is a profitable Series C company valued at $10 billion. We work in person five days a week in our San Francisco, NYC, or London offices.

About the Role

As a Senior Software Engineer (AI Data & Evaluation) at Mercor, you will be at the core of building the data infrastructure and evaluation systems that power the next generation of frontier AI models. Our team's mission is to develop high-quality data types that push frontier models forward and drive the AI industry ahead.

Software Engineers on this team are builders and innovators first. You will design and develop the evaluation methods and flywheels that drive continuous model improvement, engineer synthetic data pipelines and environments that generate high-signal training data at scale, and build the operational automation that keeps it all running with precision and efficiency. This role demands a product- and impact-oriented mindset, a bias toward shipping, and the ability to thrive at the intersection of data engineering, systems design, and applied AI research.

You Will

Innovate and develop evaluation methodologies and flywheels that continuously improve data quality and model performance at scale.
Design and build synthetic data generation systems and simulation environments that produce high-signal, high-diversity training data for frontier AI models.
Architect and ship operational automation systems that maximize throughput, efficiency, and quality across the end-to-end data pipeline.
Collaborate cross-functionally with Operations, Research, and Product to translate evolving model needs into robust, scalable engineering solutions.
Own end-to-end delivery of critical systems, from prototyping novel ideas to scaling production infrastructure.

What We're Looking For

Strong software engineering skills with a proven track record shipping production systems end-to-end.
Deep interest in and experience with AI/ML data pipelines, evaluation frameworks, or training data systems.
Systems thinking: ability to design for scalability, quality, and operational reliability simultaneously.
Comfort operating with ownership and pragmatism in fast-moving, ambiguous environments.
Effective communication and collaboration with engineering, research, and operations teams.
Experience with synthetic data generation, reinforcement learning environments, or large-scale data quality systems is highly valued.

Why Mercor

Impact: Your work directly shapes the quality of the data powering frontier models at the world's leading AI labs.
Learning: Get early, first-hand exposure to cutting-edge model capabilities months before they reach the market.
Growth: Work at the intersection of data engineering and AI research with fast paths to ownership and leadership.

Benefits

Bi-annual performance bonus structure
Generous equity grant vesting over 4 years
Up to $15K relocation bonus
$10K proximity bonus (if you live within 0.5 miles of our office)
$1.5K monthly stipend for meals
Free Equinox membership
$200 monthly laundry reimbursement
$200 monthly personal wellness reimbursement
Health, dental, and vision insurance

Vacancy posted 7 hours ago
