AI Evaluation Engineer for Coding Agents

Repovive, Inc.

Listing: repovive.com/jobs/69ed18d7682d4cf1d9e87166

Compensation: Not listed
Posted: April 25, 2026
Required Skills: AI evaluation, data pipelines, agent instrumentation
Requirements: Mid/Senior
Visa Sponsorship: Not mentioned
Relocation: Not mentioned

About the Role

Build evaluation and quality systems for Cursor's coding agents.

Interested in this role? Apply directly on Cursor's website.

Vacancy posted 20 hours ago
Similar jobs that could be interesting for you
Based on the AI Evaluation Engineer for Coding Agents in San Francisco, CA vacancy
  • $100k - $150k

     ...Most AI systems generate text. We’re building one...  ...makes decisions . Pareto Agent is a policy-driven...  ...Role As our Founding AI Engineer - Agent Runtime , you will...  ...are constrained, evaluated, and enforced by a deterministic...  ...— you treat coding agents as an execution... 
    Suggested
    Summer work
    Work at office
    Flexible hours

    Pareto Agent

    San Francisco, CA
    21 hours ago
  •  ...AI Systems Engineer - Codex Core Agents About The Team The Codex Core Agents team builds the agent harness...  ...part of how models are trained and evaluated, making this one of the highest-...  ...model outputs, use tools, execute code, and complete long-horizon tasks safely... 
    Suggested

    OpenAI

    San Francisco, CA
    5 days ago
  •  ...Applied AI Engineer The Codex Core Agent team builds the kernel of Codex. We own making the agent better...  ...on agent behaviors across real-world coding tasks and long-horizon workflows....  ...that get better real-task data into evaluation and research. Work with product teams... 
    Suggested

    OpenAI

    San Francisco, CA
    3 days ago
  • $180k - $215k

     ...looking for a seasoned Product Engineer to own and accelerate AI across Rive's editor....  ...things: Improving Rive's AI agent - We shipped an AI agent that...  ...of design, animation, and code. Most AI tools focus on one...  ..., APIs, and techniques and evaluate how they apply to Rive. Work... 
    Suggested
    Full time
    Work experience placement
    Work at office
    Remote work

    Rive

    San Francisco, CA
    20 hours ago
  • Anysphere is seeking a Software Engineer for the Agent Quality team in San Francisco, CA. In this role...  ...design and build infrastructure to evaluate and improve ML agents. Responsibilities...  ...Ideal candidates will have experience in AI evaluations, data analysis, and solid software... 
    Suggested

    Anysphere

    San Francisco, CA
    3 days ago
  • AI Systems Engineer - Codex Core Agents Location San Francisco Employment Type Full time Department Applied...  ...model outputs, use tools, execute code, and complete long-horizon tasks safely...  ...development environments. Develop evaluation, experimentation, and debugging... 
    Full time
    Work at office
    Local area
    Relocation package
    Flexible hours

    Slope

    San Francisco, CA
    4 days ago
  • $140k - $225k

     ...Technical Staff — SketchPro.ai Location: San Francisco...  ...grunt work through AI agents operating directly in...  ...What You'll Own Agent engineering across context design,...  ...broader AEC ecosystem Evaluation harnesses to determine...  ...video calls Heavy daily coding agent usage (Claude Code... 
    Full time
    H1b
    Work at office
    Visa sponsorship

    David Joseph & Company

    San Francisco, CA
    4 days ago
  • $100k - $250k

     ...A pioneering AI software firm is seeking a Senior or Staff AI Context & Harness Engineer in San Francisco. This role involves building and maintaining AI coding agents, researching improved performance methods, and employing advanced context engineering techniques. Candidates... 
    Work at office
    Remote work

    Hercules

    San Francisco, CA
    21 hours ago
  •  ...Brain Co. in San Francisco is looking for a Security Engineer for Applications & AI. You will be responsible for integrating security practices into...  ...of application security experience and proficiency in coding. Competitive salary, daily lunches, and strong team collaboration... 

    BRAIN CORP

    San Francisco, CA
    21 hours ago
  •  ...foundation of useful, agentic AI. We are here to take a big swing at the most ambitious engineering challenge of our careers. Everyone...  ...of an effective background agent. Let's design and build the rest...  ...Qwen, Llama) performance. Claude Code is all about agent/harness... 
    Work at office
    Immediate start

    Sail Research

    San Francisco, CA
    4 days ago
  • Cacheflow is seeking a Senior Applied Research Engineer to enhance the effectiveness of our AI systems through focused research and experimentation. This role involves designing information retrieval strategies and collaborating with engineers to turn validated approaches... 
    Flexible hours

    Cacheflow

    San Francisco, CA
    1 day ago
  •  ...Turn/River in San Francisco is seeking a VP of Agent Engineering to enhance engineering capacity in portfolio...  ...collaborate with CTOs and CPOs to build AI agents that streamline feature development, combining hands-on coding with strategic leadership. The ideal candidate... 

    TurnRiver.com

    San Francisco, CA
    20 hours ago
  • $170k - $216k

     ...Software Engineer, Quantitative Evaluations Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver...  ...software changes and simulated outcomes. Champion code health and best practices in a large and complex code base... 
    Full time
    Remote work

    Waymo

    San Francisco, CA
    3 days ago
  • $150k - $180k

     ...AI Evaluations Engineer – Healthcare Location: Remote, located in the US Type: Full-time Department: Engineering Reports...  ...testing platform for AI voice agents, debugging and observability tools....  ...production-grade infrastructure with code, including APIs, services, and data pipelines... 
    Remote work
    Flexible hours

    Ellipsis Health

    San Francisco, CA
    1 day ago
  •  ...AI Engineer Opportunity at Goodfin Goodfin is an AI-native investment...  ...and improve RAG pipelines, evaluations, and reliability mechanisms....  ...particularly around LLM ecosystems and agent frameworks. Who You Are...  .... ~ Strong hands-on coding experience in Python and... 

    goodfin

    San Francisco, CA
    3 days ago
  •  ...AI Prompt Engineer San Francisco, CA (On-Site M-F) Our client is an...  ...call center and scheduling agents. About the Role As an...  ...sub-agent architectures, and evaluation harnesses to iteratively improve...  ...effectively using modern AI coding tools. ~ On-site... 

    latitude

    San Francisco, CA
    1 day ago
  •  ...their hybrid cloud and AI journeys. With support...  ...an AI Forward Deployed Engineer, you will work with customers...  ...and adoption. Evaluate Model Performance: Assess...  ...developing or working with agent‐based AI solutions (e.g...  ...engineering: Strong coding skills (ideally in Python... 
    Worldwide

    IBM Computing

    San Francisco, CA
    4 days ago
  •  ...we’re transforming how engineers create, access, and share...  ...looking for a Founding AI Engineer to help us...  ...including architecture, coding, testing, and deploying...  .../or Node.js You can evaluate tradeoffs and propose the...  ...already built your own agents) You have fine-tuned... 
    Work experience placement
    Work at office
    Flexible hours

    Falconer

    San Francisco, CA
    21 hours ago
  •  ...Accenture’s Global Responsible AI team within the Global Data &...  ...if you’re an experienced RAI Engineer with a Responsible AI background...  ...practices. Detecting, evaluating, and applying relevant RAI dimensions...  ...data preparation, design, coding, testing, deployment, and support... 
    Work experience placement
    Live in
    Work at office
    Local area

    Accenture

    San Francisco, CA
    2 days ago
  •  ...A pioneering AI technology firm based in San Francisco is seeking an AI Engineer to own the evaluation infrastructure for AI agents. This role requires designing automated pipelines and building observability systems, ensuring agent performance meets enterprise standards... 
    Remote work
    Flexible hours

    Fieldguide.ai

    San Francisco, CA
    21 hours ago
  • $164.7k - $266k

     ...and implement end‑to‑end AI workflows that power...  ...briefs into concrete, agent‑powered flows: from data...  ...Partner with IT, Data Engineering, and platform teams to...  ...language models (LLMs), coding assistants, or agentic...  ...retrieval strategies, LLM evaluation frameworks, and common... 
    Work at office
    Remote work
    2 days per week

    DocuSign

    San Francisco, CA
    21 hours ago
  • $150k - $250k

     ...Distyl AI Job Posting Distyl is an applied AI...  ...build AI systems using Evaluation-Driven Development —an...  ...production. AI Evaluation Engineers focus on designing and...  ...production Python code, build evaluation pipelines...  ...inform prompt design, agent logic, model selection,... 
    Work at office
    3 days per week

    Distyl AI

    San Francisco, CA
    3 days ago
  •  ...Are The Agentic AI Software Engineer - Cybersecurity Systems designs...  ...on building and maintaining agent-based artificial...  ...of autonomously generating code, conducting security analyses...  ...cybersecurity use cases. Develop evaluation metrics for AI accuracy in threat... 
    Local area
    Work from home

    Bishop Fox

    San Francisco, CA
    4 days ago
  • $170k - $210k

     ...Senior Software Engineer, AI Engineer Hybrid - SF Bay Area About...  ...to be fluent with modern AI coding tools (Claude Code, Cursor, Copilot...  ...healthcare-grade safety and evaluation. What You Will Do...  ...design, retrieval, tool use, agents, evaluation, and production operations... 
    Work at office
    Immediate start
    Remote work
    Shift work
    2 days per week

    Midi Health

    San Francisco, CA
    3 days ago
  •  ...Ironclad Inc. is seeking an AI Evaluation Engineer to enhance contract management through AI. Located in San Francisco, the role involves analyzing datasets, designing feedback loops, and ensuring continuous improvement of ML systems. Ideal candidates will have a quantitative... 
    Contract work
    Flexible hours

    Ironclad Inc

    San Francisco, CA
    1 day ago
  •  ...A cutting-edge AI firm in San Francisco is seeking a Research Engineer to develop evaluation systems and benchmarking pipelines for language models. Candidates should have a strong background in applied research, coding skills, and familiarity with ML models. You will... 

    Mercor Inc

    San Francisco, CA
    20 hours ago
  • $164.7k - $266k

     ...campaigns to autonomous, AI‑driven customer journeys. As an AI GTM Engineer on the Growth team, you’...  ...AI‑powered workflows, agents, and tooling that make Marketing...  ...language models (LLMs), coding assistants, or agentic...  ...strategies, LLM evaluation frameworks, and common failure... 
    Contract work
    Work at office
    Local area
    Remote work
    2 days per week

    Unavailable

    San Francisco, CA
    21 hours ago
  • $115k - $200k

     ...About the job Forward Deployed AI Engineer Forward Deployed AI...  ...-backed startup applying AI agents to billions of events daily to...  ...in production. Strong coding skills in Python, Java, TypeScript...  ...(precision, recall, evaluation). Understanding how LLM systems... 
    Work at office
    Visa sponsorship

    Jenn Nguyen and Friends

    San Francisco, CA
    5 days ago
  •  ...Ironclad, located in San Francisco, is seeking an AI Evaluation Engineer to join their team. This role involves analyzing datasets, designing feedback loops, and partnering closely with AI Engineers to improve model quality. Applicants should have 8+ years of experience... 
    Contract work

    Ironclad Inc

    San Francisco, CA
    22 hours ago
  •  ...re looking for a founding engineer focused on building production agents—someone who will push our...  ...Agents on a New AI Platform This isn’t a typical...  ...they can operate on real code and real systems. Make Agents...  ...Collaborate Closely with Product & Evaluation: Work with PMs and... 
    Flexible hours

    Guild.ai, Inc.

    San Francisco, CA
    21 hours ago
