Software Engineer, RL Training Infra
OpenAI
About the Team
The Post-Training Frontiers team creates the frontier agents OpenAI ships to the world. We do the reinforcement learning training for the agentic models we ship in Codex, ChatGPT, and the API (from o1 to 5.5).
Our role consists of (1) shepherding all integrations that should go into the final RL run and deciding what can make it in, (2) babysitting and scaling the final run, and (3) building the research and infra for horizontal integrations, such as improving function calling, factuality, multi-agent capabilities, memory, calibrated thinking, etc.
About the Role
This role focuses on keeping our frontier RL training runs fast, reliable, and unblocked. You will work across engineering and infrastructure problems as they emerge, from scaling and orchestration issues to inference bottlenecks, numerical problems, and hardware failures, as well as supporting large horizontal integrations in the big run, like multi-agent capabilities or memory. This is a role for a strong generalist who quickly learns anything needed for the task, has high attention to detail, debugs deeply, and is motivated by fixing the highest-impact problem in front of the team.
In this role, you will:
- Keep large-scale RL training runs moving by jumping into the most urgent engineering and infrastructure problems.
- Debug issues across training systems, inference, orchestration, scaling, and distributed infrastructure.
- Solve hard technical problems at the boundary between research and engineering: scaling experiments, improving training reliability, debugging distributed systems, reducing latency and cost, and making new capabilities robust under real workloads.
- Improve reliability and efficiency for RL training runs.
- Help researchers who are developing infra-heavy integrations, such as multi-agent capabilities or memory.
- Turn recurring operational issues into better tools, systems, processes, or abstractions.
- Work closely with research, infrastructure, and partner teams during tight model run timelines.
- Become useful quickly in messy, ambiguous areas where ownership matters more than a perfectly scoped project.
- Debug failures that cut across model behavior, training data, RL systems, evaluation infrastructure, serving systems, and agent harnesses, then turn those failures into hypotheses, fixes, and durable improvements.
You might thrive in this role if you:
- Want to train and ship our frontier models and ensure we make agents genuinely useful for developers, enterprises, researchers, and everyday users.
- Are a strong generalist engineer with experience in some layer of ML infrastructure.
- Have worked on RL, inference, scaling, training systems, orchestration, or adjacent ML infrastructure.
- Learn extremely quickly and are comfortable operating across unfamiliar layers.
- Are a strong debugger with high ownership, low ego, and excellent communication.
- Can land in a messy area with tight timelines, become useful quickly, and gradually raise the quality of the whole system.
- Are energized by fast-moving environments where reliability, speed, and judgment matter.
- Like building load-bearing systems and processes when that is what the team needs, even if the work is not glamorous.
Nice to have:
- Experience supporting large-scale model training, async RL systems, or high-throughput ML infrastructure.
- Experience debugging distributed systems across GPUs, networking, orchestration, or inference stacks.
- Background in performance optimization, scaling, or production-critical infrastructure.
- Experience working directly with researchers or fast-moving model teams.
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.
We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or any other applicable legally protected characteristic.
For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.
Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.
To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.
We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.
At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.


