Software Engineer, Agent Evaluation and Quality Engineering · San Francisco; New York
Anysphere
Our mission is to automate coding. The first step in our journey is to build the best tool for professional programmers, using a combination of inventive research, design, and engineering. Our organization is very flat, and our team is small and talent dense. We particularly like people who are truth-seeking, passionate, and creative. We enjoy spirited debate, crazy ideas, and shipping code.

About the Role

As a Software Engineer on the Agent Quality team at Cursor, you’ll build the measurement, evaluation, and feedback-loop infrastructure that makes the Cursor core agent reliably better over time. This role sits at the intersection of product, data, and engineering: you’ll instrument what matters, help define how we judge quality, build pipelines and tooling to analyze agent behavior at scale, and partner closely with research, product, and infrastructure teams to turn insights into improvements. Your impact will compound across every Cursor product built on the shared harness, and across high-stakes decisions around model choice, quality, and cost.

What you’ll work on

- Designing and building a best-in-class AI evaluation system: curated datasets, offline replay, scorers / judges, regression alerts, and dashboards.
- Designing feedback loops from real usage: collecting, cleaning, and interpreting user signals to inform model and harness changes.
- Developing analysis tooling and workflows for debugging agent behavior: deep dives on failure modes, clustering themes, and surfacing actionable insights.
- Improving reliability and guardrails by making quality measurable and operational: defining “good/bad/degraded” sessions, alerting, and triage primitives.

You may be a fit if

- You’ve built and operated evaluation or measurement systems, such as AI evals, experimentation, ranking/relevance, or search quality.
- You can turn ambiguous “quality” questions into concrete metrics, pipelines, and decisions.
- You have strong data acumen, and can collaborate effectively with data scientists and researchers.
- You have taste and strong opinions on model and agent behaviors.
- You stay up-to-date and informed on emerging research and industry trends.
- You have strong software engineering fundamentals and enjoy shipping production systems.

U.S. EQUAL EMPLOYMENT OPPORTUNITY INFORMATION (Completion is voluntary and will not subject you to adverse treatment)

Anysphere, Inc. provides equal employment opportunities to applicants and employees without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, or disability. We invite all applicants to voluntarily self-identify their race, ethnicity, and gender. Submission of the information on this form is strictly voluntary and refusal to provide it will not subject you to any adverse treatment. Information obtained will be retained in a confidential file and separate from personnel records. This information may only be used in accordance with the provisions of applicable federal laws, executive orders, and regulations. If you want more information about any of the sections, please check with a company representative.

Gender

- Male
- Female
- Decline to self-identify

Race

- Hispanic or Latino
- White (Not Hispanic or Latino)
- Black or African American (Not Hispanic or Latino)
- Native Hawaiian or Other Pacific Islander (Not Hispanic or Latino)
- Asian (Not Hispanic or Latino)
- American Indian or Alaska Native (Not Hispanic or Latino)
- Two or More Races (Not Hispanic or Latino)
- Decline to self-identify

SELF-IDENTIFICATION OF VETERAN STATUS (Completion is voluntary and will not subject you to adverse treatment)

If you believe that you belong to any of the following categories of protected veterans, please indicate by making the appropriate selection.

- Disabled veteran: A veteran who served on active duty in the U.S. military and is entitled to disability compensation (or who but for the receipt of military retired pay would be entitled to disability compensation) under laws administered by the Secretary of Veterans Affairs, or was discharged or released from active duty because of a service-connected disability.
- Recently separated veteran: A veteran separated during the three-year period beginning on the date of the veteran’s discharge or release from active duty in the U.S. military, ground, naval, or air service.
- Active duty wartime or campaign badge veteran: A veteran who served on active duty in the U.S. military during a war, or in a campaign or expedition for which a campaign badge was authorized under the laws administered by the Department of Defense.
- Armed forces service medal veteran: A veteran who, while serving on active duty in the U.S. military, ground, naval, or air service, participated in a United States military operation for which an Armed Forces service medal was awarded pursuant to Executive Order 12985 (61 Fed. Reg. 1209).

Selection:

- I identify as one or more of the classifications of protected veteran listed above.
- I am not a protected veteran.
- I decline to self-identify for protected veteran status.

