Research Engineer: RL & Reasoning for Next-Gen LMs

Zyphra

A cutting-edge AI company based in San Francisco is seeking a Research Engineer specializing in Agency and Reasoning. The role focuses on performing research in reinforcement learning and applying innovative ideas to the next generation of their language models. Candidates should have a postgraduate degree in a scientific field and be proficient in PyTorch and Python. The company values creativity and provides a dynamic work environment with excellent benefits, including comprehensive health plans and unlimited PTO.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Research Engineer: RL & Reasoning for Next-Gen LMs in San Francisco, CA vacancy
  • $310k

    About the Team The RL and Reasoning team drives the core reasoning paradigm and has created...  ...of reinforcement learning research, building next-generation generative models, and deploying...  ...scale. About the Role As a Research Engineer/Research Scientist at OpenAI, you will... 
    Suggested
    Work at office
    Relocation package

    Slope

    San Francisco, CA
    15 hours ago
  •  ...Francisco, California. The Role: As a Research Engineer - Agency and Reasoning , you will be a core contributor to...  ...applying your ideas at scale to our next generation of language models. What...  ...model reasoning or more classical RL tasks Experience with language-model... 
    Suggested
    Work at office
    Relocation package

    Zyphra

    San Francisco, CA
    3 days ago
  • $350k

    Research Engineer, RL Infrastructure and Reliability (Knowledge Work) Anthropic’s mission is to create reliable, interpretable, and steerable...  ...has become stale or gameable. Able to read research code and reason evaluation integrity. Preferred Qualifications 5+ years of... 
    Suggested
    Visa sponsorship
    Shift work

    aijoblist

    San Francisco, CA
    15 hours ago
  •  ...plane and pair it with the full RL post-training stack:...  ...async RL trainer. We enable researchers, startups and enterprises to...  ...deployment contexts. As a Research Engineer in our Reasoning team, you'll play a crucial...  ...and tools, synthetic data gen research and proactively... 
    Suggested
    Remote work
    Worldwide
    Visa sponsorship
    Relocation package
    Flexible hours

    Prime-Intellect

    San Francisco, CA
    2 days ago
  • A technology-driven AI company in San Francisco is hiring a Research Engineer for its Cybersecurity RL team. This role involves advancing AI capabilities in secure coding and vulnerability remediation by blending research with engineering tasks. Candidates should have... 
    Flexible hours

    Menlo Ventures

    San Francisco, CA
    2 days ago
  • $200k

    A technology company in San Francisco is seeking a Software Engineer for their RL Research & Environments team. The role focuses on designing and improving data and evaluation systems to enhance model capabilities. Candidates should have a strong software engineering background... 

    SupportFinity™

    San Francisco, CA
    2 days ago
  • $250k - $350k

     ...Staff Machine Learning Research Engineer, Agent Post-training - Enterprise GenAI San Francisco...  ...MLRE, you will build out our next-gen Agent RL training platform. You'll build out the...  ...committed to working with and providing reasonable accommodations to applicants with... 
    Full time

    Scale AI

    San Francisco, CA
    3 days ago
  • $250k - $350k

     ...The Enterprise ML Research Lab works on the front lines...  ...clients. As an ML Sys Research Engineer, you'll work on building out the algorithms for our next-gen Agent RL training platform, support large...  ...working with and providing reasonable accommodations to applicants... 
    Full time

    Scale AI

    San Francisco, CA
    3 days ago
  • $180k - $270k

    Research Engineer (Focused on Search/IR) You'll own and advance the search and...  ...to tell you what to try next — you have a backlog of ideas...  ...the Head of Research and the RL‑focused Research Engineer to...  ...and when to use which. You can reason about relevance tradeoffs and... 
    Full time
    Temporary work
    Remote work

    Firecrawl

    San Francisco, CA
    3 days ago
  •  ...trustworthy searcher. We're hiring a Research Engineer to advance the science and...  ..., and decide what to try next. Build the instrumentation...  ...evaluations that distinguish genuine reasoning over evidence from plausible...  ...across post‑training, RL infrastructure, and product to... 
    Visa sponsorship

    Nerdleveltech

    San Francisco, CA
    15 hours ago
  •  ...team. You’ll work alongside researchers, operators, and AI companies...  ...About the Role As a Research Engineer at Mercor, you’ll work at the...  ...agentic behavior, and real-world reasoning. You’ll design and run evals,...  ..., rubric design, or RL‑style workflows that use evals... 
    Work at office

    Mercor

    San Francisco, CA
    15 hours ago
  • $320k

     ...a quickly growing group of committed researchers, engineers, policy experts, and business leaders...  ...capabilities. Designing and implementing RL environments for training defensive...  ...make you an offer, we will make every reasonable effort to get you a visa... 
    Relocation
    Visa sponsorship

    Anthropic

    San Francisco, CA
    1 day ago
  •  ...infrastructure / Reinforcement Learning (RL) training data & evaluations...  ...The Opportunity Our partner is hiring a Research Engineer to help scale the quality assurance (QA...  ...or any other protected characteristic. Reasonable accommodations are available throughout... 
    Remote work

    talentpluto

    San Francisco, CA
    4 days ago
  • Prime-Intellect is seeking a Research Engineer in San Francisco to shape the technological direction of their AI infrastructure. This role demands expertise in AI/ML engineering and the ability to lead research efforts in synthetic data generation. You will optimize AI... 
    Remote job
    Flexible hours

    Prime-Intellect

    San Francisco, CA
    2 days ago
  • $180k - $270k

    Research Engineer (Focused on RL) You'll bring reinforcement learning to Firecrawl's core product — building...  ...this week than one polished one next month. And when you have results, you...  ...techniques, production instincts, and fast reasoning. Founder Chat (~30 min) — Culture,... 
    Full time
    Temporary work
    Remote work

    Firecrawl

    San Francisco, CA
    3 days ago
  • Research Engineer, Virtual Collaborator at Anthropic - San Francisco, CA | New York City, NY | Seattle...  ...and preventing reward hacking in RL systems Translating product requirements...  ...make you an offer, we will make every reasonable effort to get you a visa, and we retain... 
    Work at office
    Visa sponsorship
    Flexible hours

    Victrays

    San Francisco, CA
    15 hours ago
  • $320k

     ...a quickly growing group of committed researchers, engineers, policy experts, and business leaders...  ...post-training models via fine-tuning and RL, running evaluations on trained models...  ...make you an offer, we will make every reasonable effort to get you a visa, and we... 
    Full time
    Work at office
    Visa sponsorship
    Flexible hours

    Menlo Ventures

    San Francisco, CA
    4 days ago
  •  ...a quickly growing group of committed researchers, engineers, policy experts, and business leaders...  ...AI systems. About the role The RL Velocity team owns the efficiency and...  ...make you an offer, we will make every reasonable effort to get you a visa, and we retain... 
    Remote job
    Work at office
    Visa sponsorship
    Flexible hours
    San Francisco, CA
    a month ago
  •  ...Turing is the world’s leading research accelerator for frontier AI...  ...pipelines that advance thinking, reasoning, coding, multimodality, and...  ...and reinforcement learning (RL) environments that power post...  ...: Environments for Software Engineering / coding agents UI-Environments... 
    For contractors
    Flexible hours

    Cerebras

    San Francisco, CA
    4 days ago
  • $315k - $340k

     ...a quickly growing group of committed researchers, engineers, policy experts, and business leaders...  ...and honesty. Develop and test novel RL environments that reward truthful outputs...  ...We sponsor visas. We will make every reasonable effort to obtain a visa if you are offered... 
    Work at office
    Visa sponsorship
    Flexible hours

    Anthropic

    San Francisco, CA
    4 days ago
  •  ...the first AI software engineer, and Windsurf, an AI-native...  ..., former founders, and researchers from the frontier of AI...  ...what they need next, and build systems that...  ...hold up at our largest RL training scales. Performance...  ..., and the ability to reason about performance across... 

    Cognition

    San Francisco, CA
    1 day ago
  •  ...startup focused on building next-generation Embodied AI systems...  ...experience in frontier AI research, robotics engineering, product development, and company...  ...systems that can perceive, reason, learn, and act in the...  ...optimization, offline and online RL, model‑based RL, reward... 
    Full time
    Internship

    Stryker Corporation

    San Francisco, CA
    4 days ago
  • A leading AI company based in San Francisco is seeking a research systems engineer to train large language models and explore reinforcement learning techniques. The ideal candidate will work at the intersection of research and systems design experiments at scale. Benefits... 

    Applied Compute Inc.

    San Francisco, CA
    15 hours ago
  • Pantograph is looking for research engineers to build robots that learn through exploration in the real world. Ideal candidates will have strong foundations in reinforcement learning and experience working with large GPU clusters, Kubernetes, and complex distributed systems... 

    Pantograph

    San Francisco, CA
    3 days ago
  • $350k

     ...growing group of committed researchers, engineers, policy experts, and business...  ...searches, retrieves, and reasons over information at scale....  ...environment design, data curation, RL training, evaluation, and...  ...results, and decide what to try next * Develop evaluations that... 
    Full time
    Work at office
    Visa sponsorship
    Flexible hours

    Anthropic

    San Francisco, CA
    1 day ago
  • talentpluto is seeking a Research Engineer to enhance the quality assurance (QA) systems supporting training data for reinforcement learning. This position demands close collaboration with stakeholders to guarantee reliability and consistency in datasets. Key responsibilities... 

    talentpluto

    San Francisco, CA
    3 days ago
  • Job Title: AI Research Engineer About Xterra Xterra is a Khosla Ventures-backed...  ...building AI agents that reason about complex scientific problems...  ...where we test what's next. What You'll Work On Building...  ...Experience with multimodal VLMs, RL fine-tuning, and evaluation methodology... 

    Xterraai

    San Francisco, CA
    3 days ago
  • $320k - $405k

     ...Offensive Security Research Engineer, Safeguards San Francisco, CA About Anthropic Anthropic's mission is to create reliable, interpretable...  ...candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration... 
    Work at office
    Visa sponsorship
    Flexible hours

    anthropic

    San Francisco, CA
    7 days ago
  • $160k - $240k

    Research Engineer — Evals Location: San Francisco, CA (Hybrid) OR Remote (Americas, UTC-3 to UTC-1...  ...system. Close the loop with models and RL. Evals here aren't a reporting layer — they...  ...directly influence what gets trained next. Run fast experiments and communicate clearly... 
    Full time
    Temporary work
    Work at office
    Remote work

    AI Chopping Block, Inc.

    San Francisco, CA
    15 hours ago
