Research Scientist - RL Training

$200k - $325k

Neura Market

About Snorkel

At Snorkel, we believe meaningful AI doesn’t start with the model, it starts with the data. We’re on a mission to help enterprises transform expert knowledge into specialized AI at scale. The AI landscape has gone through incredible changes between 2015, when Snorkel started as a research project in the Stanford AI Lab, and the generative AI breakthroughs of today. But one thing has remained constant: the data you use to build AI is the key to achieving differentiation, high performance, and production‑ready systems. We work with some of the world’s largest organizations to empower scientists, engineers, financial experts, product creators, journalists, and more to build custom AI with their data faster than ever before. Excited to help us redefine how AI is built? Apply to be the newest Snorkeler!

ABOUT THE ROLE

We’re looking for a Research Scientist to work on reinforcement learning for training and aligning large language models. This is a foundational research role focused on one of the most consequential open data problems in AI: how to generate the data, reward signals, and training procedures that steer LLM behavior in reliable and generalizable directions. It is also a core capability that directly differentiates Snorkel’s data‑as‑a‑service offering.

You’ll work closely with Snorkel’s research, engineering, and delivery teams to advance our RL data capabilities, translating research ideas into the preference datasets, reward models, and RL‑ready corpora we produce for frontier AI labs, and contributing to a research agenda that is central to Snorkel’s long‑term differentiation as a provider of bespoke training data.

MAIN RESPONSIBILITIES

  • Research and implement reinforcement learning techniques, including GRPO, RLHF, RLAIF, DPO, and reward modeling, and translate them into data products (preference datasets, reward signals, verifiable rewards) that customers can use to train and fine‑tune large language models.
  • Design and build data pipelines that generate high‑quality training signal for RL workflows, including AI‑assisted data annotation and curation pipelines, to improve model generalization to unseen benchmarks.
  • Prototype and iterate on end‑to‑end RL training recipes that inform what data Snorkel ships as part of its data‑as‑a‑service deliveries.
  • Work closely with research scientists, ML engineers, and delivery teams to translate RL research into customer‑ready data products.
  • Stay current with the latest developments in large‑scale multi‑node LLM training, alignment research, and scalable RL methods (on complex environments such as Terminal‑Bench), bringing relevant advances into Snorkel’s data‑as‑a‑service approach.
  • Contribute to Snorkel’s research publications and internal knowledge base in RL and model training.

PREFERRED QUALIFICATIONS

  • Deep expertise in reinforcement learning from human or AI feedback, reward modeling, and credit attribution, ideally with a clear perspective on what data makes these techniques work.
  • Experience training or fine‑tuning large language models with 30B+ parameters at scale, including familiarity with distributed training infrastructure.
  • Strong proficiency in Python and ML frameworks, especially PyTorch and HuggingFace, and hands‑on experience with RL frameworks such as Verl and SkyRL.
  • Solid software engineering fundamentals: you can build research prototypes that others can run, extend, and integrate into data production workflows.
  • Familiarity with ML infrastructure, cloud platforms, and tools (AWS, GCP, Kubernetes, Slurm, etc.); experience with large‑scale RL training pipelines is a strong plus.
  • Comfort operating in a high‑iteration environment with open‑ended research questions and shifting, customer‑driven technical constraints.
  • Ph.D. in machine learning, reinforcement learning, or a related field strongly preferred; exceptional industry experience considered.

Salary Range: $200,000–$325,000 USD

Be Your Best at Snorkel

Joining Snorkel AI means becoming part of a company that has market‑proven solutions and robust funding and is scaling rapidly, offering a unique combination of stability and the excitement of high growth. As a member of our team, you’ll have meaningful opportunities to shape priorities and initiatives, influence key strategic decisions, and directly impact our ongoing success. Whether you’re looking to deepen your technical expertise, explore leadership opportunities, or learn new skills across multiple functions, you’re fully supported in building your career in an environment designed for growth, learning, and shared success.

Snorkel AI is proud to be an Equal Employment Opportunity employer and is committed to building a team that represents a variety of backgrounds, perspectives, and skills. Snorkel AI embraces diversity and provides equal employment opportunities to all employees and applicants for employment. Snorkel AI prohibits discrimination and harassment of any type on the basis of race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local law. All employment is decided on the basis of qualifications, performance, merit, and business need. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

Vacancy posted 2 days ago