
Senior AI Agent Engineer - Open Models & Evaluation Systems

Sail Research

Sail is the foundation of useful, agentic AI. We are here to take a big swing at the most ambitious engineering challenge of our careers. Everyone working at Sail will become an expert; nothing less will do in our immensely competitive market. Inference is just one piece of an effective background agent. Let's design and build the rest of the system that turns billions of tokens into the best possible answers.

What you’ll do
  • Design custom evals for multi-turn, massively parallel agents.
  • Build agent harnesses to improve open model (DeepSeek, Qwen, Llama) performance. Claude Code is all about agent/harness codesign; let's do the same for open source!
  • Automate prompt optimization techniques like DSPy.

What we’re looking for
  • Experience building AI agents.
  • Familiarity with open source models.

Interview process
  • Meet the CEO. This is the first step because we respect your time. Ask any question and get a definitive answer immediately.
  • Meet the CTO, who will ask about your experience and share as much technical detail about Sail as you want to hear.
  • Come in to Sail's SF office for an interview day. Meet the whole team, then you'll have 3-4 hours to work on a problem that closely simulates the work we do daily. It's an objectively scored task, so you'll have immediate feedback on how well your code is working - just like we do in production! AI assistance is highly encouraged, and we'll provide a laptop with all the best tools set up. Finish with a short presentation describing your process, learnings, and results.
  • Offer. Once the team decides we want to work with you, we make a strong offer quickly and will be quite persistent over email/text/calls :)

Life at Sail
  • We work out of a beautiful, sunny office in downtown San Francisco.
  • All meals are on us (and actually great; SF is a food paradise and it would be a shame to eat only bowl slop).
  • Everyone gets a Studio Display at their desk.
  • We are serious about investing in anything that saves us time or energy.
  • There are six different ways to make coffee or tea in the office.
  • A friendly (hypoallergenic) black cat named Coco visits occasionally.

Vacancy posted 2 hours ago
