Software Engineer, AI Data & Evaluation

Mercor Alabaster

About Mercor

Mercor's mission is to organize human intelligence to power the AI economy. We partner with leading AI labs and enterprises to provide the human intelligence essential to AI development. Our vast talent network trains frontier AI models in the same way teachers teach students: by sharing knowledge, experience, and context that can't be captured in code alone. Today, more than 30,000 experts in our network collectively earn over $2 million a day.

Mercor is creating a new category of work where expertise powers AI advancement. Achieving this requires an ambitious, fast-paced and deeply committed team. You'll work alongside researchers, operators, and AI companies at the forefront of shaping the systems that are redefining society. Mercor is a profitable Series C company valued at $10 billion. We work in-person five days a week in our San Francisco, NYC, or London offices.

About the Role

As a Senior Software Engineer (AI Data & Evaluation) at Mercor, you will be at the core of building the data infrastructure and evaluation systems that power the next generation of frontier AI models. Our team's mission is to develop high-quality data types that push frontier models forward and drive the AI industry ahead.

Software Engineers on this team are builders and innovators first. You will design and develop the evaluation methods and flywheels that drive continuous model improvement, engineer synthetic data pipelines and environments that generate high-signal training data at scale, and build the operational automation that keeps it all running with precision and efficiency. This role demands a product- and impact-oriented mindset, a bias toward shipping, and the ability to thrive at the intersection of data engineering, systems design, and applied AI research.

You Will
  • Innovate and develop evaluation methodologies and flywheels that continuously improve data quality and model performance at scale.
  • Design and build synthetic data generation systems and simulation environments that produce high-signal, high-diversity training data for frontier AI models.
  • Architect and ship operational automation systems that maximize throughput, efficiency, and quality across the end-to-end data pipeline.
  • Collaborate cross-functionally with Operations, Research, and Product to translate evolving model needs into robust, scalable engineering solutions.
  • Own end-to-end delivery of critical systems, from prototyping novel ideas to scaling production infrastructure.

What We're Looking For
  • Strong software engineering skills with a proven track record shipping production systems end-to-end.
  • Deep interest in and experience with AI/ML data pipelines, evaluation frameworks, or training data systems.
  • Systems thinking: ability to design for scalability, quality, and operational reliability simultaneously.
  • Comfort operating with ownership and pragmatism in fast-moving, ambiguous environments.
  • Effective communication and collaboration with engineering, research, and operations teams.
  • Experience with synthetic data generation, reinforcement learning environments, or large-scale data quality systems is highly valued.

Why Mercor
  • Impact: Your work directly shapes the quality of data powering the world's leading AI labs' frontier models.
  • Learning: Get early, first-hand exposure to cutting-edge model capabilities months before they reach the market.
  • Growth: Work at the intersection of data engineering and AI research with fast paths to ownership and leadership.

Benefits
  • Bi-annual performance bonus structure
  • Generous equity grant vested over 4 years
  • Up to $15K relocation bonus
  • $10K proximity bonus (if you live within 0.5 miles of our office)
  • $1.5K monthly stipend for meals
  • Free Equinox membership
  • $200 monthly laundry reimbursement
  • $200 monthly personal wellness reimbursement
  • Health, dental, and vision insurance
Vacancy posted 7 hours ago