Software Engineer, AI Data & Evaluation
Mercor Alabaster
About Mercor
Mercor's mission is to organize human intelligence to power the AI economy. We partner with leading AI labs and enterprises to provide the human intelligence essential to AI development. Our vast talent network trains frontier AI models in the same way teachers teach students: by sharing knowledge, experience, and context that can't be captured in code alone. Today, more than 30,000 experts in our network collectively earn over $2 million a day. Mercor is creating a new category of work where expertise powers AI advancement. Achieving this requires an ambitious, fast-paced and deeply committed team. You'll work alongside researchers, operators, and AI companies at the forefront of shaping the systems that are redefining society. Mercor is a profitable Series C company valued at $10 billion. We work in-person five days a week in our San Francisco, NYC, or London offices.
About the Role
As a Senior Software Engineer (AI Data & Evaluation) at Mercor, you will be at the core of building the data infrastructure and evaluation systems that power the next generation of frontier AI models. Our team's mission is to develop high-quality data types that push frontier models forward and drive the AI industry ahead. Software Engineers on this team are builders and innovators first. You will design and develop the evaluation methods and flywheels that drive continuous model improvement, engineer synthetic data pipelines and environments that generate high-signal training data at scale, and build the operational automation that keeps it all running with precision and efficiency. This role demands a product- and impact-oriented mindset, a bias toward shipping, and the ability to thrive at the intersection of data engineering, systems design, and applied AI research.
You Will
- Innovate and develop evaluation methodologies and flywheels that continuously improve data quality and model performance at scale.
- Design and build synthetic data generation systems and simulation environments that produce high-signal, high-diversity training data for frontier AI models.
- Architect and ship operational automation systems that maximize throughput, efficiency, and quality across the end-to-end data pipeline.
- Collaborate cross-functionally with Operations, Research, and Product to translate evolving model needs into robust, scalable engineering solutions.
- Own end-to-end delivery of critical systems - from prototyping novel ideas to scaling production infrastructure.
- Strong software engineering skills with a proven track record shipping production systems end-to-end.
- Deep interest in and experience with AI/ML data pipelines, evaluation frameworks, or training data systems.
- Systems thinking: ability to design for scalability, quality, and operational reliability simultaneously.
- Comfort operating with ownership and pragmatism in fast-moving, ambiguous environments.
- Effective communication and collaboration with engineering, research, and operations teams.
- Experience with synthetic data generation, reinforcement learning environments, or large-scale data quality systems is highly valued.
- Impact: Your work directly shapes the quality of data powering the world's leading AI labs' frontier models.
- Learning: Get early, first-hand exposure to cutting-edge model capabilities months before they reach the market.
- Growth: Work at the intersection of data engineering and AI research with fast paths to ownership and leadership.
- Bi-annual performance bonus structure
- Generous equity grant vested over 4 years
- Up to $15K relocation bonus
- $10K proximity bonus (if you live within 0.5 miles of our office)
- $1.5K monthly stipend for meals
- Free Equinox membership
- $200 monthly laundry reimbursement
- $200 monthly personal wellness reimbursement
- Health, dental, and vision insurance
Vacancy posted 7 hours ago


