
Research Engineer: Build Self-Improving Agent Systems

Judgment Labs Inc.

Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM). While traditional observability focuses on logging exceptions and latency, our ABM surfaces behavioral anomalies such as instruction drift and context retrieval loss in scaled production environments. Hundreds of teams building autonomous agents rely on Judgment to understand how their systems are behaving post-deployment. Instead of reactive incident triage, they cluster patterns across conversations and workflows, correlate regressions to specific interaction types, and pinpoint where reliability breaks down in their usage context.

We've raised $30M+ across two rounds in the past five months. Our investors include Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, Kevin Hartz, and others.

The Role:

We are looking for Research Engineers to build AI systems that use agent interaction data to help us understand how agents behave, evaluate them at scale, and improve them through learning and feedback. Your research will not live on a whiteboard. You'll work directly with real-world agent data, apply frontier methods in production, and see your work ship immediately into the product. By making agent behavior measurable and debuggable, your systems will support teams deploying agents across finance, legal, operations, and other high-stakes workflows. You will own projects end-to-end, with significant autonomy, and work closely with the team to build self-improving agent systems.

What You'll Do:
  • Build systems to aggregate, index, and analyze large-scale agent interaction data to extract meaningful evaluation signals
  • Develop agent-based systems for analyzing and evaluating complex, long-running behaviors
  • Design and implement post-training and optimization workflows to improve agent behavior
  • Build internal tools and infrastructure to support rapid experimentation, analysis, and training

What We're Looking For:

You identify with at least one of the following:
  • You care about data quality, evaluation, and benchmarking, and are comfortable working hands-on with messy data
  • You have experience building agent systems and working with them in real-world or production settings
  • You have a strong background in reinforcement learning, agents, or machine learning fundamentals
  • You are comfortable working across infrastructure and systems, spanning training, data pipelines, and model serving
  • You are comfortable working across teams to translate research into product, balancing real-world customer constraints and tradeoffs
  • You enjoy turning ambiguous problems into clear, well-designed plans

Why Judgment?
  • Agents can't work without this. Today's agents hallucinate, drift, and break in production. We're building the infrastructure that fixes this: the monitoring layer that makes agents self-improving.
  • We're wired to win. We're a team of fewer than 20, but we ship like a team of 50+ every day. You'll be working with olympiad medalists, debate champions, and competitive athletes who bring that same intensity to company building.
  • Fast track to founding. Our engineers interface directly with customers, ship code into their environments, and use their feedback to dictate what's next on the roadmap. Everyone on the team is either an ex-founder or a founder-to-be.
  • We make sure our people do their best work. If you deserve a spot on the team, money will never get in the way of it. Full benefits, Equinox, and a private chef to take care of you.
  • We sprint hard, but we play hard too. Ask us about our Smash/Mario Kart tournaments.

Vacancy posted 3 days ago
