Software Engineer, RL Training Infra

Full-time

OpenAI

About the Team

The Post-Training Frontiers team creates the frontier agents OpenAI ships to the world. We do the reinforcement learning training for the agentic models we ship in Codex, ChatGPT, and the API (from o1 to 5.5).

Our role consists of (1) shepherding all integrations that should go into the final RL run and deciding what can make it in, (2) babysitting and scaling the final run, and (3) building the research and infra for horizontal integrations, such as improving function calling, factuality, multi-agent capabilities, memory, calibrated thinking, etc.

About the Role

This role focuses on keeping our frontier RL training runs fast, reliable, and unblocked. You will work across engineering and infrastructure problems as they emerge, from scaling and orchestration issues to inference bottlenecks, numerical problems, and hardware failures, as well as supporting large horizontal integrations in the big run, like multi-agent capabilities or memory. This is a role for a strong generalist who quickly learns anything needed for the task, has high attention to detail, debugs deeply, and is motivated by fixing the highest-impact problem in front of the team.

In this role, you will:

- Keep large-scale RL training runs moving by jumping into the most urgent engineering and infrastructure problems.

- Debug issues across training systems, inference, orchestration, scaling, and distributed infrastructure.

- Solve hard technical problems at the boundary between research and engineering: scaling experiments, improving training reliability, debugging distributed systems, reducing latency and cost, and making new capabilities robust under real workloads.

- Improve reliability and efficiency for RL training runs.

- Help researchers who are developing infra-heavy integrations, such as multi-agent capabilities or memory.

- Turn recurring operational issues into better tools, systems, processes, or abstractions.

- Work closely with research, infrastructure, and partner teams during tight model run timelines.

- Become useful quickly in messy, ambiguous areas where ownership matters more than a perfectly scoped project.

- Debug failures that cut across model behavior, training data, RL systems, evaluation infrastructure, serving systems, and agent harnesses, then turn those failures into hypotheses, fixes, and durable improvements.

You might thrive in this role if you:

- Want to train and ship our frontier models and ensure we make agents genuinely useful for developers, enterprises, researchers, and everyday users.

- Are a strong generalist engineer with experience in some layer of ML infrastructure.

- Have worked on RL, inference, scaling, training systems, orchestration, or adjacent ML infrastructure.

- Learn extremely quickly and are comfortable operating across unfamiliar layers.

- Are a strong debugger with high ownership, low ego, and excellent communication.

- Can land in a messy area with tight timelines, become useful quickly, and gradually raise the quality of the whole system.

- Are energized by fast-moving environments where reliability, speed, and judgment matter.

- Like building load-bearing systems and processes when that is what the team needs, even if the work is not glamorous.

Nice to have:

- Experience supporting large-scale model training, async RL systems, or high-throughput ML infrastructure.

- Experience debugging distributed systems across GPUs, networking, orchestration, or inference stacks.

- Background in performance optimization, scaling, or production-critical infrastructure.

- Experience working directly with researchers or fast-moving model teams.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

Vacancy posted 16 hours ago