Research Engineer: Build Self-Improving Agent Systems
Judgment Labs Inc.
Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM). While traditional observability focuses on logging exceptions and latency, our ABM surfaces behavioral anomalies such as instruction drifts and context retrieval loss in scaled production environments. Hundreds of teams building autonomous agents rely on Judgment to understand how their systems are behaving post-deployment. Instead of reactive incident triage, they cluster patterns across conversations and workflows, correlate regressions to specific interaction types, and pinpoint where reliability breaks down in their usage context.

We’ve raised $30M+ across two rounds in the past five months. Our investors include Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, Kevin Hartz, and others.

The Role:
We are looking for Research Engineers to build AI systems that use agent interaction data to help us understand how agents behave, evaluate them at scale, and improve them through learning and feedback. Your research will not live on a whiteboard. You'll work directly with real-world agent data, apply frontier methods in production, and see your work ship immediately into the product. By making agent behavior measurable and debuggable, your systems will support teams deploying agents across finance, legal, operations, and other high-stakes workflows. You will own projects end-to-end, with significant autonomy, and work closely with the team to build self-improving agent systems.

What You'll Do:
- Build systems to aggregate, index, and analyze large-scale agent interaction data to extract meaningful evaluation signals
- Develop agent-based systems for analyzing and evaluating complex, long-running behaviors
- Design and implement post-training and optimization workflows to improve agent behavior
- Build internal tools and infrastructure to support rapid experimentation, analysis, and training

What We're Looking For:
You identify with at least one of the following:
- You care about data quality, evaluation, and benchmarking, and are comfortable working hands-on with messy data
- You have experience building agent systems and working with them in real-world or production settings
- You have a strong background in reinforcement learning, agents, or machine learning fundamentals
- You are comfortable working across infrastructure and systems, spanning training, data pipelines, and model serving
- You are comfortable working across teams to translate research into product, balancing real-world customer constraints and tradeoffs
- You enjoy turning ambiguous problems into clear, well-designed plans

Why Judgment?
Agents can’t work without this. Today’s agents hallucinate, drift, and break in production. We’re building the infrastructure that fixes this: the monitoring layer that makes agents self-improving.
We’re wired to win. We're a team of fewer than 20, but we ship like 50+ on the daily. You'll be working with olympiad medalists, debate champions, and competitive athletes who bring that same intensity to company building.
Fast track to founding. Our engineers interface directly with customers, ship code into their environments, and use their feedback to dictate what’s next on the roadmap. Everyone on the team is either an ex-founder or a founder-to-be.
We make sure our people do their best work. If you deserve a spot on the team, money will never get in the way of it. Full benefits, Equinox, and a private chef to take care of you.
We sprint hard, but we play hard too. Ask us about our Smash/Mario Kart tournaments.


