Senior AI Quality Engineer — Agent Evaluation & Testing

$176k - $253k

Harper

Harper is seeking a Senior Member of Technical Staff, AI Quality, in San Francisco. Your main goal will be to turn agent quality into quantifiable metrics, ensuring high standards through robust evaluation processes. You'll build capability regression evaluation suites, design grading systems, and work directly with engineers to ensure our AI systems excel. Ideal candidates have 3–6 years of software experience, particularly in LLM and agent evaluations. Competitive compensation includes a base salary of $176,000–$253,000, with equity options and benefits like meals and a gym membership.


Vacancy posted 3 days ago

Similar jobs that could be interesting for you

Based on the Senior AI Quality Engineer — Agent Evaluation & Testing in San Francisco, CA vacancy

Senior Principal AI Agent / ML Software Engineer (OCI)
$135.2k - $306.4k
...Job Description The Senior Principal AI Agent / ML Software Engineer is a Senior Staff-level,... ..., memory, retrieval, evaluation, guardrails, and cloud services... ...eval suites, regression testing, experimentation, safety... ...to contribute high-quality production code, reviews...
Senior
Temporary work
Flexible hours
Oracle
San Francisco, CA
15 hours ago
Senior Principal AI Agent / ML Software Engineer (OCI)
$96.8k - $306.4k
Job Description The Senior Principal AI Agent / ML Software Engineer is a Senior Staff-level, hands... ..., memory, retrieval, evaluation, guardrails, and cloud services... ...eval suites, regression testing, experimentation, safety... ...to contribute high‑quality production code, reviews...
Senior
Temporary work
Flexible hours
Oracle
San Francisco, CA
4 days ago
Senior SDET: AI Testing & Quality Engineer
A leading tech company in San Francisco is seeking a Quality Engineer to develop high-performance testing infrastructure and ensure the quality of software products. The ideal candidate has strong CS fundamentals and over 5 years of experience in building testing software...
Senior
Sigma Computing
San Francisco, CA
1 day ago
Senior AI Agent Engineer - Open Models & Evaluation Systems
Sail is the foundation of useful, agentic AI. We are here to take a big swing at the most ambitious engineering challenge of our careers. Everyone working at Sail will... ...is just one piece of an effective background agent. Let's design and build the rest of the system,...
Senior
Work at office
Immediate start
Sail Research
San Francisco, CA
1 day ago
Senior AI Web QA Engineer (SDET)
Adobe Inc. in San Francisco is seeking a Senior Software Quality Engineer to establish standards for a new AI-first web product. You will build fast, reliable test systems for evolving AI workflows and ensure real-time performance, accessibility and cross-browser coverage...
Senior
Adobe Inc.
San Francisco, CA
1 day ago
Sr. AI / Machine Learning Platform Engineer - Voice Agents
...company seeks language agnostic engineers who are able and... ...building a simulation and evaluation platform for AI agents, and 50% working on the framework... ...used to rigorously test, benchmark, and improve AI... ...performance, reliability, decision quality, and failure modes. Build...
Senior
Skyrocket Ventures
San Francisco, CA
1 day ago
Senior AI Agent Runtime & Evals Engineer - Remote Equity
Braven is agentic infrastructure for reinsurance. A Senior engineer on the Platform team you own the agent runtime and the evaluation system around it. Paid in USD. Remote, with in person hackathons and offsites. You will own the agent runtime and evaluation system, building...
Senior
Remote job
Sytrex
San Francisco, CA
1 day ago
Principal AI Engineer (LLM Agents & Orchestration)
Role Title: Principal AI Engineer (LLM Agents & Orchestration) Focus: Building Autonomous "Super Agents" Who We Are... ...memory and context awareness across sessions. Evaluation & Observability: Establish a rigorous testing framework for non‑deterministic model outputs to...
ImagineArt
San Francisco, CA
1 day ago
Senior AI Web Quality Engineer
Adobe is seeking a Senior Software Quality Engineer to own the quality strategy for a new AI-first web application. You... ...and maintain modern web test automation, spanning the prompt/agent layer, in-browser experience... .... You will stand up evaluation harnesses, ensure...
Senior
Adobe
San Francisco, CA
1 day ago
Senior AI Simulation & Agent Infra Engineer
Techire AI is seeking a Senior Software Engineer to design and build simulation environments for frontier AI. You will own end-to-end RL environments... .... You will collaborate with researchers to push evaluation quality, reliability, and safety of AI systems, shaping the...
Senior
techire ai
San Francisco, CA
1 day ago
AI Agent Abuse Prevention Engineer
$240k - $360k
AI Agent Abuse Prevention Engineer... ...remote type... ...Zendesk is hiring a Senior Staff-level technical... ...company-wide and engineering changes to prevent AI... ...be used to screen or evaluate applications for this... ...complete any pre-employment testing, or otherwise participate...
Remote work
Zendesk Group
San Francisco, CA
2 days ago
AI Engineer - Agent
$140k - $300k
This role owns the full Agent development lifecycle -... ...including orchestration engines, tool-use pipelines, memory... ..., monitoring) and evaluation pipelines covering personalization quality, memory accuracy, and multi... ...canary deployment, A/B testing, and production feedback...
Kaon (prev. FlowGPT)
San Francisco, CA
1 day ago
Senior AI Evaluation Engineer
$240k - $280k
Sentry, located in San Francisco, is searching for a Senior Software Engineer to enhance its AI/ML team. In this role, you'll build the evaluation infrastructure for AI systems, ensuring they perform accurately and reliably. Responsibilities include designing datasets,...
Senior
Dormont Manufacturing Co
San Francisco, CA
1 day ago
Senior AI Engineer, Evaluation & Reliability (Remote)
$200k - $250k
Fieldguide, a remote-first company based in San Francisco, is seeking a Senior AI Engineer to lead the evaluation infrastructure for AI agents. This role focuses on enhancing the reliability and performance of AI systems to support top accounting and consulting firms....
Senior
Remote job
Flexible hours
Fieldguide
San Francisco, CA
1 day ago
Senior AI Product Engineer
$260k
...Title : Senior AI Product Engineer Location : San Francisco, CA... ...decisions about how LLMs, agents, and RAG pipelines... ...signal on AI output quality and feed it back to the... ...-stack integration testing Establish reusable... ...AI systems are built, evaluated, and deployed in high...
Senior
Harnham
San Francisco, CA
1 day ago
Staff AI Engineer — Real-Time Voice & Agent Evaluation
...build core simulation engines, evaluation systems, and... ...improving conversational agents. You’ll work at... ...real-time voice, AI evaluation, and... ...shaping how agents are tested and improved. You’... ...with senior engineers on latency... ...barge-in, and audio quality challenges, while...
Cekura
San Francisco, CA
1 day ago
Senior AI Evaluation Engineer — Metrics & Data Pipelines
$240k - $280k
A leading software monitoring company is seeking a Senior Software Engineer on its AI/ML team to build evaluation infrastructure for measuring the performance of AI systems. This role involves designing datasets, creating benchmarks, and ensuring AI features behave reliably...
Senior
Sentry
San Francisco, CA
15 hours ago
Senior Software Engineer, Agents
Responsibilities Build the AI Runtime Design and... ...and verify. Build evaluation frameworks that... ...improve quality. Implement permission... ...Product, Design, Sales Engineering, and Customer... ...Experiment with new agent architectures... ...APIs, concurrency, testing, and system design...
Senior
Jobtailor
San Francisco, CA
15 hours ago
Senior AI Software Engineer
$160k - $207k
...gets smarter as you build, with AI that learns your context to... ...Gartner in Application Security Testing and is trusted by leading... ...dev. About the role As an AI engineer, you’ll apply LLM technologies... ...powered solutions and rigorously evaluate the efficacy of different prompts...
Senior
Currently hiring
Local area
Remote work
Weekend work
3 days per week
Semgrep
San Francisco, CA
4 days ago
Cyber Strategy, Risk & Compliance - AI Engineering for Cybersecurity - Senior Manager
$124k - $280k
...expertise, and network to deliver quality results. You motivate and... ...through innovative, AI-driven solutions. As a Senior Manager, you will lead... ...strategy, transformation and engineering projects and teams Design... ...closely with team members. We evaluate these factors thoughtfully...
Senior
Full time
H1b
PwC
San Francisco, CA
3 hours ago
Senior AI/ML Engineer: Training & Evaluation (Remote)
$80 per hour
Prolific is seeking an AI & Machine Learning Engineer in San Francisco to evaluate and refine AI models. The role involves auditing ML code, providing human feedback on AI frameworks, and analyzing model reasoning. Candidates should have a relevant degree and experience...
Senior
Remote job
Hourly pay
Work from home
Flexible hours
Prolific
San Francisco, CA
1 day ago
Senior Software Engineer, Agent Orchestration
$250k - $375k
Senior Software Engineer, Agent Orchestration Decagon is a leading conversational AI platform that empowers every brand to... ...coordinating model reasoning, evaluating agent behavior, and... ..., reliability, and quality. In this role, you... ...through better testing, observability, and...
Senior
Full time
Work at office
Decagon
San Francisco, CA
1 day ago
Senior Software Quality Engineer
$113.4k - $221.75k
...a new‑generation, AI‑first web application... ...from scratch, and quality engineering is a core founding... ...focus. We seek a Senior Software Quality... ...creating fast, reliable test systems for... ...spanning the prompt and agent layer, the in‑... .... Stand up evaluation harnesses for automated...
Senior
Shift work
Adobe
San Francisco, CA
3 hours ago
Senior Software Engineer, Agent Orchestration
$200k - $400k
...conversational AI platform empowering... ...to deploy AI agents that power... ...frontier‑style engineering. The team continuously... ...offline evaluation and online experimentation... ...to improve quality, reliability,... ...the Role As a Senior Software... ...experimentation (A/B testing) and contribute...
Senior
Full time
Work at office
Local area
Decagon
San Francisco, CA
1 day ago
Senior Software Engineer, Quantitative Evaluations
$204k - $259k
...environments crucial for testing and training the... ...behavior of diverse agents (vehicles, pedestrians... ...we create metrics and evaluation methodologies to measure... ...assessing simulator quality, identifying critical... ...and reports to a Senior Engineering Manager. You will:...
Senior
Waymo
San Francisco, CA
1 day ago
Member of Technical Staff (AI Software Engineer, Agents)
$200k - $300k
...Full time Department AI Compensation $200K -... ...is seeking an energetic engineer to join our highly driven Comet Agents engineering team. The Comet... ...a high craft and quality bar, in both AI agent performance... ..., code quality, AI evaluation, testing, and maintenance across...
Full time
Flexible hours
B Capital
San Francisco, CA
4 days ago
Senior Software Engineer, Investigative Agent
$170k - $200k
...Opportunity We're hiring a Senior Software Engineer to drive the... ...Nightshift , a conversational agent that helps... ...Experience with LLM evaluation (LangSmith/Langfuse),... ...and logging for the AI system to monitor agentic... ...Establish best practices for testing and deploying AI...
Senior
Work at office
Work from home
Home office
Flexible hours
Night shift
Flock
San Francisco, CA
15 hours ago
Staff/Senior Agentic AI Engineers (Multiple roles) San Francisco, California
$180k - $215k
Staff/Senior Agentic AI Engineers (Multiple roles) Heartflow is a medical technology... ..., non‑invasive cardiac test supported by the ACC/AHA Chest... ...autonomous, intelligent agents to execute multi‑step medical... ...Implement advanced guardrails, evaluation frameworks, and reasoning...
Senior
Local area
Worldwide
Relocation
HeartFlow, Inc.
San Francisco, CA
1 day ago
AI Engineer/ML Engineer - Senior Developers - AI Training - San Francisco, US
$80 per hour
Overview AI & Machine Learning Engineer - AI Training at Prolific. Prolific is building the biggest pool of quality human data in the world. Over 35,000 AI... ...skills. Responsibilities Evaluate LLM architecture logic:... ...performance: conduct comparative testing between different model...
Senior
Hourly pay
Work from home
Flexible hours
Prolific
San Francisco, CA
1 day ago
Senior AI Platform Engineer: Simulation & Evaluation
Skyrocket Ventures is looking for engineers who are proficient in Python and willing to learn new technologies. The role focuses on building and improving simulation and evaluation platforms for AI agents in San Francisco. Successful candidates will have 5-10 years of...
Senior
Skyrocket Ventures
San Francisco, CA
1 day ago
