
Senior AI Quality Engineer — Agent Evaluation & Testing

$176k - $253k

Harper

Harper is seeking a Senior Member of Technical Staff, AI Quality, in San Francisco. Your main goal will be to turn agent quality into quantifiable metrics, ensuring high standards through robust evaluation processes. You'll build capability regression evaluation suites, design grading systems, and work directly with engineers to ensure our AI systems excel. Ideal candidates have 3–6 years of software experience, particularly in LLM and agent evaluations. Competitive compensation includes a base salary of $176,000–$253,000, with equity options and benefits like meals and a gym membership.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the Senior AI Quality Engineer — Agent Evaluation & Testing in San Francisco, CA vacancy
  • Anysphere is seeking a Software Engineer for the Agent Quality team in San Francisco, CA. In this role, you...  ...design and build infrastructure to evaluate and improve ML agents. Responsibilities...  ...candidates will have experience in AI evaluations, data analysis, and solid... 
    Suggested

    Anysphere

    San Francisco, CA
    4 days ago
  • $96.8k - $306.4k

     ...Job Description The Senior Principal AI Agent / ML Software Engineer is a Senior Staff-level,...  ..., memory, retrieval, evaluation, guardrails, and cloud services...  ...eval suites, regression testing, experimentation, safety...  ...to contribute high-quality production code, reviews... 
    Senior
    Temporary work
    Flexible hours

    Oracle

    San Francisco, CA
    4 days ago
  • OutSystems is seeking a Senior AI Quality Engineer in San Francisco to lead quality initiatives in an AI-powered environment. This role includes defining test strategies, mentoring engineers, and ensuring reliable product delivery. The ideal candidate has extensive experience... 
    Senior

    OutSystems

    San Francisco, CA
    2 days ago
  • $204k - $235k

    OutSystems, Inc. is seeking a Senior AI Quality Engineer to lead quality management within their AI-integrated product ecosystem. In this role, you will define testing strategies, implement automation processes, and mentor junior engineers. The ideal candidate will have... 
    Senior

    OutSystems, Inc.

    San Francisco, CA
    4 days ago
  • Ellipsis Health is seeking a Forward Deployed QA Engineer to ensure the quality of its conversational AI product, Sage. The role requires expertise in software...  ...will engage with clients to perform rigorous testing and analysis. This position is based in the San Francisco... 
    Senior
    Remote job
    Flexible hours

    Ellipsis Health

    San Francisco, CA
    4 days ago
  • OutSystems Inc. is looking for a Senior Quality Engineer in San Francisco to lead quality initiatives within our AI product ecosystem. You will define and implement comprehensive testing strategies, focusing on automation and AI metrics, ensuring the reliability and success... 
    Senior

    OutSystems Inc.

    San Francisco, CA
    21 hours ago
  • Cacheflow is seeking a Senior Applied Research Engineer to enhance the effectiveness of our AI systems through focused research and experimentation. This role involves designing information retrieval strategies and collaborating with engineers to turn validated approaches... 
    Senior
    Flexible hours

    Cacheflow

    San Francisco, CA
    2 days ago
  • $160k - $207k

     ...gets smarter as you build, with AI that learns your context to...  ...Gartner in Application Security Testing and is trusted by leading...  ....dev. About the role As an AI engineer, you’ll apply LLM technologies...  ...powered solutions and rigorously evaluate the efficacy of different prompts... 
    Senior
    Currently hiring
    Local area
    Remote work
    Weekend work
    3 days per week

    Semgrep

    San Francisco, CA
    2 days ago
  • $200k - $290k

     ...is building production-ready AI agents that handle complex, real-world...  ...at scale. As a Full-Stack Engineer on the Agent Engineering team...  ...performance. Integrate and evaluate cutting-edge text and voice models...  ..., maintainable code, strong testing practices, and thoughtful... 

    Viridian Staffing

    San Francisco, CA
    24 days ago
  • black.ai is seeking a Senior Research Engineer in San Francisco to develop next-gen generative video and audio technology. This role significantly impacts...  ...designing context for multi-turn sessions, building evaluation metrics, and closely collaborating with product and... 
    Senior

    black.ai

    San Francisco, CA
    2 days ago
  • Principal AI Engineer (LLM Agents & Orchestration) Role Title: Principal AI Engineer (LLM Agents & Orchestration) Focus...  ...memory and context awareness across sessions. Evaluation & Observability: Establish a rigorous testing framework for non‑deterministic model outputs... 

    ImagineArt

    San Francisco, CA
    21 hours ago
  • $194k - $239k

     ...Senior Agentic AI Engineer Hover helps people design, improve, and...  ...focused on building, testing, and improving production...  ...of: multi-agent orchestration production AI systems evaluation and reliability engineering...  ...to deliver high-quality AI experiences. Contribute... 
    Senior
    Full time
    For contractors
    Work at office
    Local area
    Flexible hours

    Almaz Capital

    San Francisco, CA
    1 day ago
  •  ...re building an agentic AI caregiver advocate...  ...coordinates across multiple sub-agents to get things done. It...  ...over time. The AI engineering challenge: build an...  ...-use frameworks, and evaluation infrastructure that...  ...good enough. Develop testing infrastructure for multi... 
    Senior
    Immediate start
    Remote work
    Flexible hours

    Citizen Health

    San Francisco, CA
    10 hours ago
  •  ...Senior AI Engineer Disney Entertainment and ESPN Product & Technology...  ...You will create shared agents, initializers,...  ...governance, observability, and evaluation, so teams can deliver high-quality AI solutions quickly—...  ...in Python (libraries, testing, packaging) and... 
    Senior

    The Walt Disney Studios

    San Francisco, CA
    4 days ago
  • $141.9k - $190.3k

     ...Senior Software Engineer - AI Core Engineering Disney Entertainment and...  ...You will create shared agents, initializers,...  ...governance, observability, and evaluation, so teams can deliver high-quality AI solutions quickly—...  ...in Python (libraries, testing, packaging) and... 
    Senior

    Disney France

    San Francisco, CA
    4 days ago
  • $240k - $280k

    A leading software monitoring company is seeking a Senior Software Engineer on its AI/ML team to build evaluation infrastructure for measuring the performance of AI systems. This role involves designing datasets, creating benchmarks, and ensuring AI features behave reliably... 
    Senior

    Sentry

    San Francisco, CA
    4 days ago
  • $170k - $210k

     ...Summary At the Innovation & Engineering Center California (IECC) we conduct...  ...models in real vehicles, evaluate testability, and ensure...  ...documentation skills. Desired Skills AI ethics: bias mitigation and...  ...‑employment substance abuse testing... 
    Senior
    Contract work
    Overseas

    SupportFinity™

    San Francisco, CA
    1 day ago
  •  ...inventive research, design, and engineering. Our organization is very flat...  ...As a Software Engineer on the Agent Quality team at Cursor, you’ll build the measurement, evaluation, and feedback-loop...  ...Designing and building best-in‑class AI evaluation system: curated datasets... 

    Anysphere

    San Francisco, CA
    4 days ago
  • B Capital is seeking a highly skilled AI Platform Engineer to enhance their ML/AI platform that powers autonomous AI agents at scale. This pivotal role combines software engineering...  ...agent harness infrastructure, implement evaluation frameworks, and ensure a seamless journey... 
    Senior

    B Capital

    San Francisco, CA
    3 days ago
  • $137k - $188k

     ...reporting to the Forensic Engineering Manager, the Senior Compliance Engineer is a key...  ...Essential Job Functions: Product Testing and Analysis Test consumer...  ..., and multimedia framework evaluation. Analyze Android and iOS...  ...gather intelligence. Use AI‑assisted tools to support product... 
    Senior
    Full time
    Work at office
    Local area
    Remote work
    Worldwide

    Dolby

    San Francisco, CA
    11 hours ago
  • $216k - $270k

    About Scale AI Scale AI is the data...  ...Role Overview As a Senior Forward Deployed AI Engineer on our...  ...configure AI models and agents within customer...  ...sources Implement evaluation frameworks to...  ...experimentation and A/B testing to improve model...  ...the high‑quality data and full‑... 
    Senior
    Full time

    Neura Market

    San Francisco, CA
    2 days ago
  • Drata is seeking a Senior Applied Research Engineer to enhance the quality of AI systems through rigorous evaluation and experimentation. This role emphasizes applied research, focusing on information retrieval and reasoning strategies. The ideal candidate will bring 5+... 
    Senior

    jobr.pro

    San Francisco, CA
    21 hours ago
  • $130k - $160k

     ...Senior Quality Engineer – Design Assurance (Firmware / Electrical Systems) An innovative, well-...  ...and assess quality, validation, and testing impacts. Develop and execute test...  ...Experience developing test methodologies and evaluating the impact of design changes on... 
    Senior
    Contract work

    SciPro

    San Francisco, CA
    21 hours ago
  • $124k - $280k

     ...expertise, and network to deliver quality results. You motivate and...  ...through innovative, AI-driven solutions. As a Senior Manager, you will lead...  ...strategy, transformation and engineering projects and teams Design...  ...with team members. We evaluate these factors thoughtfully... 
    Senior
    Full time
    H1b

    PwC

    San Francisco, CA
    21 hours ago
  •  ...at the intersection of AI, biology, chemistry, and large-scale engineering. Our goal is to translate...  ...systems. The Role As a Senior AI/ML Engineer, you will...  ...Do Design, train, and evaluate large-scale models, including...  ...: clean code, testing, reproducibility, and debugging... 
    Senior
    Remote work
    Flexible hours

    Absentia Labs

    San Francisco, CA
    21 hours ago
  • $155k

     ...About the Team The Quality Engineering team builds the shared testing infrastructure and...  ...are looking for a Senior Software Engineer,...  ...in implementing how AI reshapes quality engineering...  ...that enable AI agents to validate the...  ...Experience using or evaluating AI-powered engineering... 
    Senior
    Contract work
    Local area
    Home office
    Flexible hours
    Shift work

    Scribd, Inc.

    San Francisco, CA
    2 days ago
  •  ...B consulting Industry. AI Transformation will be...  ...will be delivered through agents. We built that. Our AI...  ...for an AI Product Engineer who sits at the intersection...  ...you prototype ideas, evaluate tradeoffs, and ship MVPs...  ...prototype rapidly and test product directions with... 
    Senior
    Work at office
    Local area
    Relocation package

    Klarity Intelligence, Inc.

    San Francisco, CA
    2 days ago
  • $86.5k - $142.7k

     ...prototypes and builds modern, AI‑enabled applications and...  ...‑of‑concept, and guiding engineering teams through complex...  ...search, prompt orchestration, evaluation and guardrails. Author...  ...and raise technical quality. Leverage AI coding and testing tools to accelerate development... 
    Senior
    Summer holiday
    Flexible hours

    Ernst & Young Oman

    San Francisco, CA
    2 days ago
  • $120k - $140k

     ...manage User Acceptance Testing (UAT) deadlines,...  ...maintaining rigorous quality standards. Ellipsis...  ...Forward Deployed QA Engineer, you will occupy a critical...  ...core conversational AI product, Sage, across...  ...by configuring shadow agents. Prompt Evaluation & Optimization: Apply... 
    Senior
    Full time
    Remote work
    Flexible hours

    Ellipsis Health, Inc.

    San Francisco, CA
    12 hours ago
  • Perplexity is seeking energetic engineers to join our highly driven Agents engineering team. The Agents team consists of AI/ML, backend, and full-...  ...Ensure a high craft and quality bar, in both AI agent...  ...reliability, code quality, AI evaluation, testing, and maintenance across... 
    Flexible hours

    Neura Market

    San Francisco, CA
    21 hours ago
