Research Scientist / Engineer - Training Infrastructure
$220k - $300k
Ipro Networks Pte. Ltd.
Job Title: Research Scientist / Engineer – Training Infrastructure
Position Type: Full time
Location: Palo Alto, CA • Remote - US • Remote - International
Salary Range: $220,000 - $300,000 (USD)
Job ID#: 154559

We believe that multimodality is critical for intelligence. To go beyond language models and build more aware, capable and useful systems, the next step function change will come from vision. So, we are working on training and scaling up multimodal foundation models for systems that can see and understand, show and explain, and eventually interact with our world to effect change.

We are looking for engineers with significant experience solving hard problems in PyTorch, CUDA and distributed systems. You will work alongside the rest of the research team to build & train cutting edge foundation models on thousands of GPUs that are built to scale from the ground up.

Responsibilities
- Design, implement, and optimize efficient distributed training systems for models with thousands of GPUs
- Research and implement advanced parallelization techniques (FSDP, Tensor Parallel, Pipeline Parallel, Expert Parallel)
- Build monitoring, visualization, and debugging tools for large-scale training runs
- Optimize training stability, convergence, and resource utilization across massive clusters

Requirements
- Extensive experience with distributed PyTorch training and parallelisms in foundation model training
- Deep understanding of GPU clusters, networking, and storage systems
- Familiarity with communication libraries (NCCL, MPI) and distributed system optimization
- (Preferred) Strong Linux systems administration and scripting capabilities
- (Preferred) Experience managing training runs across >100 GPUs
- (Preferred) Experience with containerization, orchestration, and cloud infrastructure
- ...hardware and robot systems to the infrastructure and state-of-the-art... ...possibly by our cutting edge research and end-to-end system design... ...and robot hardware Develop training strategies that produce better... ...— or equivalent research/engineering experience Publication record...Training
- ...environments and handling scenarios unseen in training. We work at the intersection of... ..., robotics, and systems, with a research team that includes researchers from... ...reality. We're looking for a Research Scientist or Research Engineer to advance dexterous manipulation —...Training
- We're looking for Research Scientists and Research Engineers with deep robotics or autonomous systems domain knowledge to adapt our web-pretrained video model to real robot tasks. Post-training at Rhoda means taking a causal video generation model pretrained on internet...TrainingShift work
- ...and robot systems to the infrastructure and state-of-the-art foundation... ...by our cutting edge research and end-to-end system design... ...re looking for a Research Scientist or Research Engineer to own the strategy and systems... ...demonstrations our models train on. What You'll Do...Training
- ...hardware and robot systems to the infrastructure and state-of-the-art... ...possibly by our cutting edge research and end-to-end system... ...We're looking for Research Scientists and Research Engineers to push the frontier of large-scale pre-training for our video action model....Training
- ...: Full time Department: Research Overview At Rhoda AI, we... ...and robot systems to the infrastructure and state‑of‑the‑art foundation... ...for Applied Research Scientists and Research Engineers to take our foundation... ...modern ML pipelines: pre‑training, fine‑tuning, evaluation,...TrainingFull time
- ...possible, we are building across the entire robotics stack. We’re training state-of-the-art AI models that leverage our large-scale,... ...time on the things they value most. As a Machine Learning Research Engineer, you will work on the software and algorithms that enable...Training
- ...and robot systems to the infrastructure and state-of-the-art foundation... ...by our cutting edge research and end-to-end system... ...'re looking for Research Scientists and Research Engineers to build the data and evaluation... ...closely with pre-training and post-training teams to...Training
- ...the next generation of data infrastructure at Mistral AI. You will be a... ...governed data access for MLOps and research. You will take full... ...call rotations for critical training jobs. What will you do Build... ...anticipates exabyte growth. Platform Engineering: Contribute to the...TrainingWork at officeVisa sponsorship
- Ipro Networks Pte. Ltd. is seeking a Research Scientist / Engineer in Palo Alto, CA to develop and optimize distributed training infrastructure for multimodal foundation models. This role involves significant experience with PyTorch and managing large-scale GPU clusters...TrainingRemote job
- ...million, and former DeepMind research scientist Jason Ma. The company has... ...We are seeking a Research Engineer / Scientist to join our team... ...production research pipelines—from training advanced models to... ...production inference/debugging infrastructure. Familiarity with multi‑...TrainingTemporary work
$160k - $350k
Collinear is a research-focused AI company building systems that... .... We work across post-training, RL, distillation, and evaluation... ...with founders, research scientists, and engineering leads High-impact... ...them into robust, scalable infrastructure that moves the needle on real...TrainingFull timeWork at officeLocal areaImmediate startRelocation packageFlexible hours$150.29k - $171.67k
...Research Computing Cloud Engineer Business Affairs: University IT (UIT), Stanford, California, United... ...Responsibilities Research, HPC & AI Infrastructure: Architect cloud-native and hybrid... ...intensive workloads, such as AI training and genomics. Cloud Bursting & Integration...TrainingHourly payFull timeFixed term contractWeekend workAfternoon shift
- Senior Research Engineer, Training Data Infrastructure in Foundation Models Cupertino, California, United States - Software and Services Our team is dedicated... ...data itself. You will work alongside Research Scientists to transform theoretical observations into concrete,...Training
$126k - $423k
...Valley company is creating the digital infrastructure needed to bring intelligence to every... ...We are looking for a passionate Research Engineer (AI/RL Infrastructure) to join the Research... ..., YOU WILL: * Design and build training and evaluation infrastructure to support...TrainingFull timeFor contractorsFor subcontractorCasual workWork at officeImmediate startRemote workDay shift- ...hardware and robot systems to the infrastructure and state-of-the-art foundation... ...made possibly by our cutting edge research and end-to-end system design. We'... ...reality. We're looking for a Research Engineer to build and maintain the training platform that powers our model...Training
$207k - $300k
AI Innovation and Research Software Engineer, Platforms and Devices Google | Mountain View, CA, USA... .... Experience with Machine Learning Infrastructure. Experience with Machine Learning Research... ..., and relevant education or training. Your recruiter can share more about...TrainingFull time- Harmonic, located in Palo Alto, California, is seeking engineers for its reinforcement learning team. You will maintain and optimize the RL training and serving infrastructure, ensuring peak performance for model workloads. The ideal candidate has a strong background in...Training
- Rhoda AI, based in California, is seeking Research Scientists and Research Engineers to innovate in large-scale pre-training for video action models. You will lead the design and training of causal video generation models using web-scale video data. The ideal candidate...Training
- A cutting-edge web technology company is seeking a Research Engineer to enhance its core research product. The role involves improving models for web-scale indexing and establishing training strategies. Candidates should possess deep intuitions for modern model systems...Training
- ...trade-offs into upside. Make high-conviction bets - Try and fail. But succeed an unfair amount. Job Research engineer. You will enhance our core research product to train and scale models that serve a web-scale index. This includes shaping model architecture and...Training
- ...growing product lines. We're looking for a Research Engineer who can both set technical direction... ..., spoken-language domains. Design, train, and evaluate neural MT models — from... ...adjacent problems across the ML org. Data & infrastructure Identify, source, and curate training...Training
- ...’s most advanced mathematical reasoning engine, recently achieving Gold Medal-level performance... ...seeking a highly motivated and skilled Research Engineer to join our Reinforcement... ...to Chatbots That Don't Make Stuff Up? Training Data Podcast: Why Vlad Tenev and Tudor Achim...Training
- ...are building a mathematical reasoning engine that operates with absolute precision.... ...communication primitives to our distributed training loops and inference engines. We are... ...our proprietary RL training and serving infrastructure. You have the authority to refactor any...Training
$174k - $255k
Software Engineering Mountain View, CA (HQ) About the Team: Our team... ...products, and is not a research‑based role. Our team is small... ...Google Cloud Platform‑based infrastructure for software development and... ..., and relevant education or training. Your recruiter can share more...TrainingFull time- Research Engineer - Multimodal AI Los Altos, CA About Orbifold AI Backed by top-tier VCs in... ...redefining multimodal data curation and model training for enterprise AI. Our mission is to... ...that powers our multimodal AI infrastructure. Your primary focus will be developing...TrainingFlexible hours
- ...culture on Role Summary About the Research Engineering team The team spans Platform (shared... ...models. Working hand-in-hand with Research Scientists, you’ll either join: - Platform RE Team: Enhance the shared training framework, data pipelines and cluster...TrainingWork at officeVisa sponsorship
- ...Labs is an applied AI research lab pioneering data... ...environment curation for training and evaluating agents... ...for a Research Engineer to bridge cutting-edge... ...systems or research infrastructure at scale Proficiency... ...engineering or applied scientist role Contributions to...Training
$174k - $252k
Research Engineer, Gemmaverse Variants Research, DeepMind corporate_fare DeepMind place Mountain... ...experience, and relevant education or training. Your recruiter can share more about... ...Responsibilities Contribute to core tools and infrastructure for LLM research, driving major...TrainingFull time- ...in machine learning engineering or large-scale software... ..., or responsible AI research. Experience in Python... ...Architect and optimize training and inference... ...Collaborate with Research Scientists to translate safety research... ...maintain evaluation infrastructure to systematically...Training


