Senior AI Agent Engineer - Open Models & Evaluation Systems
Sail Research
Sail is the foundation of useful, agentic AI. We are here to take a big swing at the most ambitious engineering challenge of our careers. Everyone working at Sail will become an expert; nothing less will do in our immensely competitive market. Inference is just one piece of an effective background agent. Let's design and build the rest of the system that turns billions of tokens into the best possible answers.

What you’ll do
- Design custom evals for multi-turn, massively parallel agents.
- Build agent harnesses to improve open model (DeepSeek, Qwen, Llama) performance. Claude Code is all about agent/harness codesign; let's do the same for open source!
- Automate prompt optimization techniques like DSPy.

What we’re looking for
- Experience building AI agents.
- Familiarity with open source models.

Interview process
- Meet the CEO. This is the first step because we respect your time. Ask any question and get a definitive answer immediately.
- Meet the CTO, who will ask about your experience and share as much technical detail about Sail as you want to hear.
- Come in to Sail's SF office for an interview day. Meet the whole team, then you'll have 3-4 hours to work on a problem that closely simulates the work we do daily. It's an objectively scored task, so you'll have immediate feedback on how well your code is working, just like we do in production! AI assistance is highly encouraged, and we'll provide a laptop with all the best tools set up. Finish with a short presentation describing your process, learnings, and results.
- Offer. Once the team decides we want to work with you, we make a strong offer quickly and will be quite persistent over email/text/calls :)

Life at Sail
- We work out of a beautiful, sunny office in downtown San Francisco.
- All meals are on us (and actually great; SF is a food paradise and it would be a shame to eat only bowl slop).
- Everyone gets a Studio Display at their desk. We are serious about investing in anything that saves us time or energy.
- There are six different ways to make coffee or tea in the office.
- A friendly (hypoallergenic) black cat named Coco visits occasionally.


