Senior Software Engineer, AI Evals
$240k - $280k
Sentry
Bad software is everywhere, and we’re tired of it. Sentry is on a mission to help developers write better software faster so we can get back to enjoying technology. With more than $217 million in funding and 100,000+ organizations that believe we’re on to something, we're building performance and error monitoring tools that help companies like Disney, Microsoft, and Atlassian spend less time fixing bugs and more time building products.

Sentry embraces a hybrid work model across our global hubs, with Mondays, Tuesdays, and Thursdays set as in-office anchor days to encourage meaningful collaboration.

If you like to selfishly build things that make your digital life better, come help us build the next generation of software monitoring tools.

About the role

As a Senior Software Engineer on Sentry’s AI/ML team, you’ll be responsible for building the evaluation infrastructure that measures the accuracy, reliability, and real‑world performance of our AI systems. This role is critical to ensuring that our debugging agents and AI‑powered features behave correctly, safely, and predictably as they scale. You’ll design datasets, benchmarks, and test harnesses that turn ambiguous AI behavior into measurable signals, helping the team ship AI with confidence.

In this role you will

- Design and build robust evaluation frameworks to measure accuracy, reliability, regressions, and edge cases in AI systems.
- Create and curate high‑quality datasets, golden test cases, and benchmarks grounded in real production data.
- Build automated test harnesses and metrics pipelines to continuously evaluate models, prompts, and agentic workflows.
- Partner closely with applied AI engineers and product leaders to define what “good” looks like and translate it into measurable criteria.
- Own the evaluation lifecycle for major AI initiatives, from early experimentation through production monitoring.

You’ll love this job if you

- Care deeply about correctness, rigor, and measurement in AI systems.
- Enjoy turning fuzzy product goals and model behavior into concrete tests and metrics.
- Like building foundational infrastructure that unlocks faster iteration and higher confidence for the entire AI team.
- Thrive in cross‑functional environments and enjoy influencing model design through better evaluation.

Qualifications

- At least 5 years of professional experience with a Bachelor’s degree in computer science, machine learning, or a related field.
- Experience building testing, evaluation, or data infrastructure for complex systems (AI/ML experience strongly preferred).
- Comfort writing production‑quality code (we use Python and TypeScript).
- Experience working with structured and unstructured datasets, labeling workflows, or data quality pipelines.
- Familiarity with modern ML systems and evaluation techniques (e.g., offline metrics, online evaluation, regression testing for models or prompts).
- Bonus: experience evaluating LLMs, agentic systems, or AI‑assisted developer tools.

The base salary range (or hourly wage range, if applicable) that Sentry reasonably expects to pay for this position is $240,000 to $280,000. A successful candidate’s actual base salary (or hourly wage) amount will be determined by a variety of relevant factors including, without limitation, the candidate’s work location, education, work and other relevant experience, skills, and job‑related knowledge.

A successful candidate will be eligible to participate in Sentry’s employee benefit plans/programs applicable to the candidate’s position (including incentive compensation, equity grants, paid time off, and group health insurance coverage). See Sentry Benefits for more details about the Company’s benefit plans/programs.
Equal Opportunity at Sentry

Sentry is committed to providing equal employment opportunities to its employees and candidates for employment regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, or other legally‑protected characteristic. This commitment includes the provision of reasonable accommodations to employees and candidates for employment with physical or mental disabilities who require such accommodations in order to (a) perform the essential functions of their jobs, or (b) seek employment with Sentry.

We strive to build a diverse team, with an inclusive culture where every teammate can thrive. Sentry is an open‑source company because we believe that everyone, everywhere, should have the ability and tools to make great software. Software should be accessible. That starts with making our industry accessible.

