Research Internship Reinforcement Learning (Summer)

Cohere

Internship Opportunity

Our mission is to scale intelligence to serve humanity. We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.

Join us on our mission and shape the future!

Duration: Minimum 4 months (summer 2026, with potential extension)

About the Project

This internship offers a unique opportunity to contribute to cutting-edge research in reinforcement learning (RL) and large language models (LLMs), focusing on two interconnected projects:

  1. Combining Self-Distillation and Reinforcement Learning for LLMs, with Applications to Code and Agentic Tasks: This project explores how LLMs can improve through self-reflection and iterative learning by combining reinforcement learning with verifiable rewards (RLVR) and self-distillation. The focus is on scenarios where structured feedback from verifiers, compilers, unit tests, or tool calls enables models to detect errors, revise outputs, and learn from failures. The internship will bridge theoretical mathematical modeling of self-distillation with practical, production-oriented implementation.

  2. Dealing with Extremely Large Rollouts in RLVR: As RLVR becomes a cornerstone for training reasoning-oriented LLMs, the challenge of handling extremely large rollouts grows. This project investigates mechanisms such as summarization, memory, context compaction, hierarchical sub-agents, and resumable rollouts to enable unbounded or very long trajectories. It also explores how to effectively learn from such trajectories, as traditional RLVR objectives fail when episodes exceed context window limits.
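
For candidates less familiar with context compaction, the idea named in the second project can be sketched roughly as follows. This is an illustrative toy, not the team's method: the names `count_tokens`, `naive_summarizer`, and `compact` are hypothetical, whitespace-separated words stand in for real tokenizer tokens, and the summarizer is a placeholder for what would in practice be a model-generated summary.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokenizer tokens.
    return len(text.split())


def naive_summarizer(chunks: List[str]) -> str:
    # Placeholder for a model-written summary: keep the first 5 words of each chunk.
    return "SUMMARY: " + " | ".join(" ".join(c.split()[:5]) for c in chunks)


def compact(
    history: List[str],
    budget: int,
    keep_recent: int = 2,
    summarize: Callable[[List[str]], str] = naive_summarizer,
) -> List[str]:
    """Replace older trajectory steps with one summary once the budget is exceeded.

    The most recent `keep_recent` steps are kept verbatim. The result is not
    guaranteed to fit the budget; a real system would re-check and repeat.
    """
    total = sum(count_tokens(step) for step in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    split = len(history) - keep_recent
    old, recent = history[:split], history[split:]
    return [summarize(old)] + recent
```

Calling `compact(steps, budget=10)` on a four-step trajectory that exceeds the budget would return one summary entry followed by the last two steps unchanged.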

Both projects are grounded in recent research and aim to advance the state-of-the-art in LLM training and deployment.
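
To make the notion of a verifiable reward from unit tests more concrete, here is a minimal Python sketch under stated assumptions. It is not Cohere's implementation: the `Verdict` type and `verify` function are hypothetical names, the reward is assumed to be binary, and the candidate is run in a plain subprocess with a timeout rather than a proper sandbox.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Verdict:
    reward: float   # 1.0 if every test passed, else 0.0 (assumed binary reward)
    feedback: str   # stderr text the model could use to revise its output


def verify(candidate_source: str, test_source: str, timeout_s: float = 10.0) -> Verdict:
    """Run model-written code against assertion-style tests in a subprocess."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "check.py"
        script.write_text(candidate_source + "\n\n" + test_source + "\n")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return Verdict(reward=0.0, feedback="timeout")
    # A zero exit code means no assertion failed and no exception was raised.
    reward = 1.0 if result.returncode == 0 else 0.0
    return Verdict(reward=reward, feedback=result.stderr)
```

The `feedback` field carries the traceback from a failed run, which is the kind of structured signal the first project describes using for error detection and revision.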

Responsibilities

  • Conduct literature reviews and implement state-of-the-art algorithms in RL and self-distillation.

  • Design and execute experiments to evaluate the effectiveness of proposed methods on code generation and agentic tasks.

  • Develop and maintain codebases for both theoretical modeling and practical implementations.

  • Collaborate with researchers to analyze results, refine methodologies, and prepare findings for publication.

  • Contribute to the design of mechanisms for handling large rollouts, such as summarization and hierarchical sub-agents.

  • Document progress, methodologies, and outcomes clearly and comprehensively.

Requirements

  • Technical Skills:

    • Strong background in machine learning, particularly reinforcement learning and deep learning.

    • Proficiency in Python and experience with ML frameworks (e.g., PyTorch, TensorFlow).

    • Familiarity with LLMs and their training paradigms.

    • Experience with coding tasks, unit testing, or compiler tools is a plus.

  • Educational Background:

    • Currently pursuing a Master's or PhD in Computer Science, Machine Learning, or a related field.

  • Soft Skills:

    • Ability to work independently and manage complex projects.

    • Strong problem-solving and analytical skills.

    • Excellent communication skills for collaborating with a research team.

  • Additional:

    • Prior experience with RLVR, self-distillation, or large-scale ML experiments is highly desirable.

    • Willingness to learn and adapt to new methodologies and tools.

If some of the above doesn't line up perfectly with your experience, we still encourage you to apply!

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities.

Vacancy posted 4 days ago