Machine Learning Researcher - RL and Agentic

Protege

Machine Learning Researcher

Data is the foundation of AI performance, and we believe model quality starts with data quality. As AI systems become more agentic, a critical challenge is understanding which real-world datasets, tasks, and environments actually lead to better model behavior.

We're seeking a Machine Learning Researcher focused on RL and agentic systems to help define, design, and evaluate the datasets, tasks, environments, and benchmarks used to assess advanced AI systems. In this role, you'll work closely with research and engineering teams to translate real-world workflows into high-value datasets and evaluation assets: structured tasks, interactive environments, benchmark suites, and quality scorecards that help us understand how models perform in realistic settings.

You'll help define what "high-quality agentic data" means in practice, using statistical, computational, and ML-driven methods to evaluate dataset quality, task design, environment fidelity, and downstream model performance. You'll work on the core problems of benchmarking real-world data, measuring how well models perform on that data, and designing RL-style or agentic environments that capture the structure of meaningful work.

This is an ideal role for someone with a strong machine learning background who is excited by reinforcement learning, agentic systems, evaluation, and the role of data in shaping model behavior. You should want to build the datasets and benchmarks that help define what high-quality real-world data looks like for frontier AI systems.

What You'll Do
Design and build datasets, tasks, and environments

Design and build datasets, tasks, environments, and evaluation assets for benchmarking agentic systems and multi-step model behavior.

Translate real-world workflows into structured tasks, interaction traces, trajectories, stateful environments, and verifiable outcomes that can be used to evaluate advanced AI systems.

Develop frameworks for evaluating real-world data quality

Develop frameworks that assess diversity, realism, coverage, fidelity, informativeness, and downstream usefulness of datasets for agentic systems.

Build quality scorecards and evaluation methods that make dataset strengths, weaknesses, and failure modes legible across teams.

Benchmark model behavior in RL and agentic settings

Evaluate planning, tool use, robustness, recovery from failure, task completion, and generalization behavior in RL-style or agentic environments.

Connect model failures back to concrete dataset, environment, or task-design gaps and recommend improvements grounded in empirical evidence.

Build scalable evaluation and validation tooling

Contribute to tools and systems that automate dataset validation, environment generation, rollout analysis, benchmark construction, and evaluation workflows.

Improve internal infrastructure for reproducible experimentation, benchmark management, and evaluation quality.

Partner across research, engineering, and product

Collaborate closely with research and engineering teams to identify data bottlenecks, improve evaluation methodology, and shape internal best practices around task-grounded AI training data.

Represent DataLab's perspective in cross-functional discussions around dataset quality, benchmark design, and frontier agentic-system evaluation.

What Success Looks Like
Near-term: establish a strong evaluation baseline

Create clear benchmark frameworks, evaluation assets, and dataset-quality scorecards that help Protege reason about how real-world data impacts advanced agentic systems.

Use rigorous evaluation methods to identify meaningful dataset improvements, improve benchmark fidelity, and sharpen the company's understanding of what high-impact agentic data actually looks like in practice.

What You Bring
  • PhD, or a Master's degree plus 4+ years of industry experience, in machine learning, computer science, statistics, engineering, mathematics, economics, or a related quantitative field.

  • Strong understanding of AI model training pipelines, evaluation methodology, and the role of data in shaping model performance.

  • Experience working with large, unstructured, or semi-structured datasets used to train or evaluate ML systems.

  • Experience with reinforcement learning, sequential decision-making, agentic systems, tool-using models, or multi-step model evaluation.

  • Experience designing tasks, benchmarks, environments, simulations, or evaluation frameworks for real-world model behavior.

  • Strong intuition for realism, coverage, difficulty, fidelity, and meaningful outcome structure in datasets.

  • Strong experimental design, evaluation, benchmarking, and data-validation skills.

  • High ownership and ability to independently identify and solve high-impact problems.

Nice to have
  • Experience developing evaluation frameworks or performance metrics for datasets, agentic systems, or training data.

  • Experience translating real-world workflows into structured tasks or environments for model evaluation.

  • Experience with RLHF, RLAIF, imitation learning, reward modeling, online or offline RL, or related methods.

  • Experience with Harbor or other agent evaluation frameworks.

  • Publications or open-source contributions in reinforcement learning, agents, evaluation, or data-centric AI.

  • Experience collaborating cross-functionally with product, infrastructure, or partnership teams.

  • Experience with synthetic data generation, trajectory generation, or simulation-based environments.

Protege's Values

Pass the Loved Ones' Test

We act with integrity and do the right thing, especially when it's hard and no one is watching.

Always Find a Way

We are resourceful, resilient builders who solve hard problems and push through obstacles.

Go Fast and Grow Fast

Velocity matters. We move with urgency, learn quickly, and continuously improve as individuals and as a company.

Practice Kindness and Candor

We communicate directly and respectfully, building trust through honest feedback and genuine care for one another.

Deliver Together

We win as one team. Collaboration, accountability, and shared ownership drive our success.

Own the Outcome. Hone the Craft.

We take pride in our work, sweat the details, and continuously raise the bar for excellence.

Vacancy posted 1 day ago