Senior AI Agent Engineer - Open Models & Evaluation Systems

Sail Research

Sail is the foundation of useful, agentic AI. We are here to take a big swing at the most ambitious engineering challenge of our careers. Everyone working at Sail will become an expert; nothing less will do in our immensely competitive market.

Inference is just one piece of an effective background agent. Let's design and build the rest of the system that turns billions of tokens into the best possible answers.

What you'll do

- Design custom evals for multi-turn, massively parallel agents.
- Build agent harnesses to improve open model (DeepSeek, Qwen, Llama) performance. Claude Code is all about agent/harness codesign; let's do the same for open source!
- Automate prompt optimization techniques like DSPy.

What we're looking for

- Experience building AI agents.
- Familiarity with open source models.

Interview process

1. Meet the CEO. This is the first step because we respect your time. Ask any question and get a definitive answer immediately.
2. Meet the CTO, who will ask about your experience and share as much technical detail about Sail as you want to hear.
3. Come in to Sail's SF office for an interview day. Meet the whole team, then you'll have 3-4 hours to work on a problem that closely simulates the work we do daily. It's an objectively scored task, so you'll have immediate feedback on how well your code is working, just like we do in production! AI assistance is highly encouraged, and we'll provide a laptop with all the best tools set up. Finish with a short presentation describing your process, learnings, and results.
4. Offer. Once the team decides we want to work with you, we make a strong offer quickly and will be quite persistent over email/text/calls :)

Life at Sail

- We work out of a beautiful, sunny office in downtown San Francisco.
- All meals are on us (and actually great; SF is a food paradise and it would be a shame to eat only bowl slop).
- Everyone gets a Studio Display at their desk. We are serious about investing in anything that saves us time or energy.
- There are six different ways to make coffee or tea in the office.
- A friendly (hypoallergenic) black cat named Coco visits occasionally.



