AI Training Systems Engineer: Distributed & RL
B Capital
B Capital in San Francisco is looking for an engineering professional to architect and optimize core training infrastructure for their AI models. You will work on distributed systems and large-scale data pipelines, focusing on performance and numerical stability. Successful candidates will have strong software engineering skills and experience in either distributed training or data infrastructure. The role offers top-tier compensation and comprehensive health and wellness benefits.
- ...Machine Learning Systems Engineer, RL Engineering San Francisco, CA | New... ...interpretable, and steerable AI systems. We want AI to be safe... ...cutting-edge systems that train AI models like Claude. You're... ...performance, large scale distributed systems Large scale LLM training... (Training, Work at office, Visa sponsorship, Flexible hours)
- A leading AI research company in San Francisco seeks Senior/Staff Engineers skilled in distributed systems and large-scale ML training. Responsibilities include designing systems optimized for low-bandwidth conditions and implementing robust training strategies. Ideal... (Training, Remote work)
- A leading AI technology company in San Francisco is looking for a Senior Software Engineer to build scalable infrastructure for large‑scale training and fine-tuning of foundation models. You will design distributed training systems and optimize GPU utilization while collaborating... (Training)
- An AI and Robotics firm in San Francisco seeks a Staff/Principal ML Systems Engineer to enhance training performance for multimodal robotic data. You will lead efforts to improve end... ...will have significant experience in distributed training, a strong background in PyTorch... (Training)
- Genesis AI in San Francisco is looking for an experienced professional to optimize and build distributed training systems using PyTorch. The ideal candidate has over 8 years of experience in distributed systems, high-performance computing, and extensive expertise in Python... (Training)
- ...Type On-site Department Engineering Our Mission Reflection... ...states. Our team of AI researchers and company... ...that power our research, training, and production environments. These systems form the foundational... ...multi-tenant isolation. Distributed Systems Architecture:... (Training, Full time, Relocation package)
- ...AI/ML Engineer (RL & Physical Systems) FLUIX is building the AI Operating System for data centers. We deploy... ..., from thermal systems to power distribution, where milliseconds and megawatts matter... ...environments to accelerate training, testing, and Sim2Real deployment.... (Training, Weekend work)
$117.2k - $313.7k
...Category Software Engineering Job Details About... ...Salesforce is the #1 AI CRM, where humans with... ...components/frameworks in distributed filesystems in an ever... ...that improve system scalability, robustness... ...promotion, benefits, training, assessment of job performance... (Training, Immediate start, Remote work)
- ...About Us Most AI is frozen in place - it... ...time. Our vision is AI systems that are flexible, personalized... .... Researchers and ML engineers will hand you... ...: Design and operate distributed inference systems for... ...curate the datasets behind training and evaluation. The... (Training, Flexible hours)
$150k - $300k
...that enables anyone to create, train, and deploy them. We... ...plane and pair it with the full rl post-training stack: environments... ...contexts. As a Research Engineer working on Distributed Training, you'll play a... ...focusing on our decentralizing AI training stack. If you love... (Training, Remote work, Worldwide, Visa sponsorship, Relocation package, Flexible hours)
- ...frontier of post-training and reinforcement... ...applied research engineers sit side-by-side with... ...it takes to bring AI to the enterprise.... ...founders. We've built RL infrastructure at... ...at Scale AI, and systems at Together, Two... ...with experience in distributed training Strong... (Training, Daily paid, Work at office, Visa sponsorship, Relocation package)
$255k - $405k
Slope is seeking a Software Engineer for its team in San Francisco, CA. The... ...for large-scale multimodal training. Responsibilities include managing distributed data pipelines and collaborating... ...strong experience in distributed systems and possess excellent organizational... (Training)
$146.5k
...team: The ML Data Engineering team powers metadata... ...users worldwide. Our systems operate at massive... ...data engineering, and distributed systems, collaborating... ...cutting-edge generative AI and metadata enrichment... ...relevant education or training; and other business and... (Training, Local area, Worldwide, Home office, Flexible hours)
$180k - $215k
As a Backend Engineer on our application team at Windfall... ...will be building the system for ingesting and processing... ...and build a scalable distributed system capable of... ...relevant education or training. We also offer a comprehensive... ...intelligence and AI company that gives go-to... (Training)
$295k
...to seamlessly blend high-level AI capabilities with the constraints of physical systems to improve people's lives.... ...About the Role As a Research Engineer, Distributed Data Systems, you will design... ...powers large-scale multimodal training and evaluation at OpenAI. You'... (Training, Work at office, Relocation package)
$142.6k - $261.5k
...organizations. Using our product-driven, AI-centric approach, we empower... ..., designers, and software engineers enable our clients to solve... .... Knowledgeable in system development lifecycle and technology... ...and interest in cloud and distributed systems architectures... (Summer holiday, Flexible hours)
$146.5k - $228k
...the team: The ML Data Engineering team powers metadata extraction... ...users worldwide. Our systems operate at massive... ...data engineering, and distributed systems, collaborating... ...-edge generative AI and metadata enrichment... ...relevant education or training; and other business... (Training, Temporary work, Local area, Worldwide, Home office, Flexible hours)
$166k - $225k
...the world's best data and AI infrastructure platform... ...their business. Founded by engineers — and customer obsessed —... ...the next generation distributed data storage and processing systems that can outperform specialized... ...certifications and training, and specific work location... (Training, Local area, Worldwide)
$350k
Research Engineer, RL Infrastructure and Reliability (Knowledge Work)... ...interpretable, and steerable AI systems. We want AI to be safe and... ...Knowledge Work team builds training environments and evaluations... ...experience operating ML or distributed systems at scale, including... (Training, Visa sponsorship, Shift work)
$255k - $405k
...multimodal functionalities into our AI products, ensuring they are reliable,... ...benefit. About the Role As a Software Engineer, Distributed Data Systems, you will design and scale the... ...infrastructure that powers large‑scale multimodal training and evaluation at OpenAI. You’ll... (Training, Full time, Work at office, Local area, Relocation package, Flexible hours)
- ...research on Protocol Learning : multi-participant training of foundation models where no single participant has... ...economics. We’re looking for Senior/Staff engineers with 5+ years of experience in distributed systems and ML large‑scale training. You’ll be implementing... (Training, Remote work, Visa sponsorship)
$200k - $280k
A leading AI company in San Francisco is looking for a Staff Machine Learning Engineer to enhance inference systems at production scale. You will design algorithms, optimize performance, and collaborate on RL and post-training pipelines. Ideal candidates have 3+ years... (Training, Full time)
$350k
Menlo Ventures is seeking a Research Engineer to enhance the reliability and infrastructure of AI systems focused on professional workflows. The ideal candidate will... ...scale. Responsibilities include ensuring stable training environments, automating observability tools, and... (Training, Work at office)
- ...As a Research Engineer, Distributed Data Systems, you will design and scale the infrastructure that powers large-scale multimodal training and evaluation at OpenAI. You’ll manage distributed data pipelines, collaborate closely with researchers to translate requirements... (Training)
$335k
...infrastructure that powers large-scale AI systems. We design and deliver next-... ...that support frontier model training and inference across an... ...We are seeking a System Engineer (Network / Storage / Systems... ...firmware, Linux systems, or distributed infrastructure. Experience... (Training, Work at office, Relocation package)
$250k - $350k
...AI is becoming vitally important in every function... ...state of the art post-training algorithms to reach the... ...As an ML Sys Research Engineer, you'll work on building... ...for our next-gen Agent RL training platform, support... ...to optimize our ML system. Your customer will be... (Training, Full time)
- A leading AI technology firm in San Francisco seeks an ML Sys Research Engineer to optimize algorithms for their next-generation Agent RL training platform. The role involves building and profiling frameworks, post-training state-of-the-art models, and collaborating with... (Training)
- A leading tech company based in San Francisco is seeking a Software Engineer to enhance its data and AI platform. The role involves developing high-performance distributed data systems and delivering on ambitious projects such as Delta Lake and performance engineering....
- AI Systems Engineer - Codex Core Agents The Codex Core Agents team builds the agent harness that... ...and increasingly part of how models are trained and evaluated, making this one of the... ...or operated production systems in distributed systems, infrastructure, developer tooling... (Training)
- AI Systems Engineer - Codex Core Agents Location San Francisco Employment Type Full time Department... ...or operated production systems in distributed systems, infrastructure, developer... ...using LLM systems, model evals, or post‑training feedback loops. Background in compilers... (Training, Full time, Work at office, Local area, Relocation package, Flexible hours)

