Senior Software Engineer, AI Evals

$240k - $280k

Sentry

About Sentry

Software runs the world and the pace is faster than ever. Sentry helps developers fix errors and performance issues before users notice, so teams can spend less time firefighting and more time building.

Trusted by 200,000+ organizations, Sentry is today's application monitoring standard and our team is building its AI-native future.

About the role

As a Senior Software Engineer on Sentry's AI/ML team, you'll be responsible for building the evaluation infrastructure that measures the accuracy, reliability, and real-world performance of our AI systems. This role is critical to ensuring that our debugging agents and AI-powered features behave correctly, safely, and predictably as they scale. You'll design datasets, benchmarks, and test harnesses that turn ambiguous AI behavior into measurable signals, helping the team ship AI with confidence.

In this role you will
  • Design and build robust evaluation frameworks to measure accuracy, reliability, regressions, and edge cases in AI systems
  • Create and curate high-quality datasets, golden test cases, and benchmarks grounded in real production data
  • Build automated test harnesses and metrics pipelines to continuously evaluate models, prompts, and agentic workflows
  • Partner closely with applied AI engineers and product leaders to define what "good" looks like and translate it into measurable criteria
  • Own the evaluation lifecycle for major AI initiatives, from early experimentation through production monitoring

You'll love this job if you
  • Care deeply about correctness, rigor, and measurement in AI systems
  • Enjoy turning fuzzy product goals and model behavior into concrete tests and metrics
  • Like building foundational infrastructure that unlocks faster iteration and higher confidence for the entire AI team
  • Thrive in cross-functional environments and enjoy influencing model design through better evaluation

Qualifications
  • At least 5 years of professional experience and a Bachelor's degree in computer science, machine learning, or a related field
  • Experience building testing, evaluation, or data infrastructure for complex systems (AI/ML experience strongly preferred)
  • Comfort writing production-quality code (we use Python and TypeScript)
  • Experience working with structured and unstructured datasets, labeling workflows, or data quality pipelines
  • Familiarity with modern ML systems and evaluation techniques (e.g., offline metrics, online evaluation, regression testing for models or prompts)
  • Bonus: experience evaluating LLMs, agentic systems, or AI-assisted developer tools

The base salary range (or hourly wage range, if applicable) that Sentry reasonably expects to pay for this position is $240,000 to $280,000 USD. A successful candidate's actual base salary (or hourly wage) amount will be determined by a variety of relevant factors including, without limitation, the candidate's work location, education, work and other relevant experience, skills, and job-related knowledge. A successful candidate will be eligible to participate in Sentry's employee benefit plans/programs applicable to the candidate's position (including incentive compensation, equity grants, paid time off, and group health insurance coverage). See Sentry Benefits for more details about the Company's benefit plans/programs.

Equal Opportunity at Sentry

Sentry is committed to providing equal employment opportunities to its employees and candidates for employment regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, or other legally-protected characteristic. This commitment includes the provision of reasonable accommodations to employees and candidates for employment with physical or mental disabilities who require such accommodations in order to (a) perform the essential functions of their jobs, or (b) seek employment with Sentry. We strive to build a diverse team, with an inclusive culture where every teammate can thrive. Sentry is an open-source company because we believe that everyone, everywhere, should have the ability and tools to make great software. Software should be accessible. That starts with making our industry accessible.

If you need assistance or an accommodation due to a disability, you may contact us at [email address hidden].

Want to learn more about how Sentry handles applicant data? Get the details in our Applicant Privacy Policy.
Vacancy posted 3 days ago