Research Engineer: Build Self-Improving Agent Systems

Judgment Labs Inc.

Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM). While traditional observability focuses on logging exceptions and latency, our ABM surfaces behavioral anomalies such as instruction drift and context retrieval loss in scaled production environments. Hundreds of teams building autonomous agents rely on Judgment to understand how their systems are behaving post-deployment. Instead of reactive incident triage, they cluster patterns across conversations and workflows, correlate regressions to specific interaction types, and pinpoint where reliability breaks down in their usage context.

We’ve raised $30M+ across two rounds in the past five months. Our investors include Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, Kevin Hartz, and others.

The Role:

We are looking for Research Engineers to build AI systems that use agent interaction data to help us understand how agents behave, evaluate them at scale, and improve them through learning and feedback. Your research will not live on a whiteboard. You'll work directly with real-world agent data, apply frontier methods in production, and see your work ship immediately into the product. By making agent behavior measurable and debuggable, your systems will support teams deploying agents across finance, legal, operations, and other high-stakes workflows. You will own projects end-to-end, with significant autonomy, and work closely with the team to build self-improving agent systems.
What You'll Do:

- Build systems to aggregate, index, and analyze large-scale agent interaction data to extract meaningful evaluation signals
- Develop agent-based systems for analyzing and evaluating complex, long-running behaviors
- Design and implement post-training and optimization workflows to improve agent behavior
- Build internal tools and infrastructure to support rapid experimentation, analysis, and training

What We're Looking For:

You identify with at least one of the following:

- You care about data quality, evaluation, and benchmarking, and are comfortable working hands-on with messy data
- You have experience building agent systems and working with them in real-world or production settings
- You have a strong background in reinforcement learning, agents, or machine learning fundamentals
- You are comfortable working across infrastructure and systems, spanning training, data pipelines, and model serving
- You are comfortable working across teams to translate research into product, balancing real-world customer constraints and tradeoffs
- You enjoy turning ambiguous problems into clear, well-designed plans

Why Judgment?

Agents can’t work without this. Today’s agents hallucinate, drift, and break in production. We’re building the infrastructure that fixes this: the monitoring layer that makes agents self-improving.

We’re wired to win. We're a team of fewer than 20, but we ship like 50+ on the daily. You'll be working with olympiad medalists, debate champions, and competitive athletes who bring that same intensity to company building.

Fast track to founding. Our engineers interface directly with customers, ship code into their environments, and use their feedback to dictate what’s next on the roadmap. Everyone on the team is either an ex-founder or a founder-to-be.

We make sure our people do their best work. If you deserve a spot on the team, money will never get in the way of it. Full benefits, Equinox, and a private chef to take care of you.
We sprint hard but we play hard; ask us about our Smash/Mario Kart tournaments.


Vacancy posted 4 days ago

