Software Engineer, Agent Evaluation and Quality Engineering · San Francisco; New York
Anysphere
Our mission is to automate coding. The first step in our journey is to build the best tool for professional programmers, using a combination of inventive research, design, and engineering. Our organization is very flat, and our team is small and talent dense. We particularly like people who are truth-seeking, passionate, and creative. We enjoy spirited debate, crazy ideas, and shipping code.

About the Role

As a Software Engineer on the Agent Quality team at Cursor, you’ll build the measurement, evaluation, and feedback-loop infrastructure that makes the Cursor core agent reliably better over time. This role sits at the intersection of product, data, and engineering: you’ll instrument what matters, help define how we judge quality, build pipelines and tooling to analyze agent behavior at scale, and partner closely with research, product, and infrastructure teams to turn insights into improvements. Your impact will compound across every Cursor product built on the shared harness, and across high-stakes decisions around model choice, quality, and cost.

What you’ll work on

- Designing and building a best-in-class AI evaluation system: curated datasets, offline replay, scorers / judges, regression alerts, and dashboards.
- Designing feedback loops from real usage: collecting, cleaning, and interpreting user signals to inform model and harness changes.
- Developing analysis tooling and workflows for debugging agent behavior: deep dives on failure modes, clustering themes, and surfacing actionable insights.
- Improving reliability and guardrails by making quality measurable and operational: defining “good/bad/degraded” sessions, alerting, and triage primitives.

You may be a fit if

- You’ve built and operated evaluation or measurement systems, such as AI evals, experimentation, ranking/relevance, or search quality.
- You can turn ambiguous “quality” questions into concrete metrics, pipelines, and decisions.
- You have strong data acumen, and can collaborate effectively with data scientists and researchers.
- You have taste and strong opinions on model and agent behaviors.
- You stay up-to-date and informed on emerging research and industry trends.
- You have strong software engineering fundamentals and enjoy shipping production systems.

U.S. EQUAL EMPLOYMENT OPPORTUNITY INFORMATION (Completion is voluntary and will not subject you to adverse treatment)

Anysphere, Inc. provides equal employment opportunities to applicants and employees without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, or disability. We invite all applicants to voluntarily self-identify their race, ethnicity, and gender. Submission of the information on this form is strictly voluntary and refusal to provide it will not subject you to any adverse treatment. Information obtained will be retained in a confidential file and separate from personnel records. This information may only be used in accordance with the provisions of applicable federal laws, executive orders, and regulations. If you want more information about any of the sections, please check with a company representative.

Gender

- Male
- Female
- Decline to self-identify

Race

- Hispanic or Latino
- White (Not Hispanic or Latino)
- Black or African American (Not Hispanic or Latino)
- Native Hawaiian or Other Pacific Islander (Not Hispanic or Latino)
- Asian (Not Hispanic or Latino)
- American Indian or Alaska Native (Not Hispanic or Latino)
- Two or More Races (Not Hispanic or Latino)
- Decline to self-identify

SELF-IDENTIFICATION OF VETERAN STATUS (Completion is voluntary and will not subject you to adverse treatment)

If you believe that you belong to any of the following categories of protected veterans, please indicate by making the appropriate selection.

- Disabled veteran: A veteran who served on active duty in the U.S. military and is entitled to disability compensation (or who but for the receipt of military retired pay would be entitled to disability compensation) under laws administered by the Secretary of Veterans Affairs, or was discharged or released from active duty because of a service-connected disability.
- Recently separated veteran: A veteran separated during the three-year period beginning on the date of the veteran’s discharge or release from active duty in the U.S. military, ground, naval, or air service.
- Active duty wartime or campaign badge veteran: A veteran who served on active duty in the U.S. military during a war, or in a campaign or expedition for which a campaign badge was authorized under the laws administered by the Department of Defense.
- Armed forces service medal veteran: A veteran who, while serving on active duty in the U.S. military ground, naval, or air service, participated in a United States military operation for which an Armed Forces service medal was awarded pursuant to Executive Order 12985 (61 Fed. Reg. 1209).

Selection

- I identify as one or more of the classifications of protected veteran listed above.
- I am not a protected veteran.
- I decline to self-identify for protected veteran status.


