
Research Scientist / Engineer - Training Infrastructure

$220k - $300k

Ipro Networks Pte. Ltd.

Job Title: Research Scientist / Engineer – Training Infrastructure
Position Type: Full time
Location: Palo Alto, CA • Remote - US • Remote - International
Salary Range: $220,000 - $300,000 (USD)
Job ID#: 154559

We believe that multimodality is critical for intelligence. To go beyond language models and build more aware, capable and useful systems, the next step function change will come from vision. So, we are working on training and scaling up multimodal foundation models for systems that can see and understand, show and explain, and eventually interact with our world to effect change.

We are looking for engineers with significant experience solving hard problems in PyTorch, CUDA and distributed systems. You will work alongside the rest of the research team to build & train cutting edge foundation models on thousands of GPUs that are built to scale from the ground up.

Responsibilities
  • Design, implement, and optimize efficient distributed training systems for models with thousands of GPUs
  • Research and implement advanced parallelization techniques (FSDP, Tensor Parallel, Pipeline Parallel, Expert Parallel)
  • Build monitoring, visualization, and debugging tools for large-scale training runs
  • Optimize training stability, convergence, and resource utilization across massive clusters

Requirements
  • Extensive experience with distributed PyTorch training and parallelisms in foundation model training
  • Deep understanding of GPU clusters, networking, and storage systems
  • Familiarity with communication libraries (NCCL, MPI) and distributed system optimization

  • (Preferred) Strong Linux systems administration and scripting capabilities
  • (Preferred) Experience managing training runs across >100 GPUs
  • (Preferred) Experience with containerization, orchestration, and cloud infrastructure

As an Equal Opportunity Employer, IntelliPro values diversity and does not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, disability, or any other legally protected group status. Moreover, our Inclusivity Commitment emphasizes embracing candidates of all abilities and ensures that our hiring and interview processes accommodate the needs of all applicants. Learn more about our commitment to diversity and inclusivity at "".

Compensation: The pay offered to a successful candidate will be determined by various factors, including education, work experience, location, job responsibilities, certifications, and more. Additionally, IntelliPro provides a comprehensive benefits package, all subject to eligibility.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Research Scientist / Engineer - Training Infrastructure in Palo Alto, CA vacancy
  •  ...hardware and robot systems to the infrastructure and state-of-the-art...  ...possibly by our cutting edge research and end-to-end system design...  ...and robot hardware Develop training strategies that produce better...  ...— or equivalent research/engineering experience Publication record... 
    Training

    Rhoda ai

    Palo Alto, CA
    5 days ago
  •  ...environments and handling scenarios unseen in training. We work at the intersection of...  ..., robotics, and systems, with a research team that includes researchers from...  ...reality. We're looking for a Research Scientist or Research Engineer to advance dexterous manipulation —... 
    Training

    Rhoda ai

    Palo Alto, CA
    5 days ago
  • We're looking for Research Scientists and Research Engineers with deep robotics or autonomous systems domain knowledge to adapt our web-pretrained video model to real robot tasks. Post-training at Rhoda means taking a causal video generation model pretrained on internet... 
    Training
    Shift work

    Rhoda ai

    Palo Alto, CA
    2 days ago
  •  ...and robot systems to the infrastructure and state-of-the-art foundation...  ...by our cutting edge research and end-to-end system design...  ...re looking for a Research Scientist or Research Engineer to own the strategy and systems...  ...demonstrations our models train on. What You'll Do... 
    Training

    Rhoda ai

    Palo Alto, CA
    5 days ago
  •  ...hardware and robot systems to the infrastructure and state-of-the-art...  ...possibly by our cutting edge research and end-to-end system...  ...We're looking for Research Scientists and Research Engineers to push the frontier of large-scale pre-training for our video action model.... 
    Training

    Rhoda ai

    Palo Alto, CA
    5 days ago
  •  ...: Full time Department: Research Overview At Rhoda AI, we...  ...and robot systems to the infrastructure and state‑of‑the‑art foundation...  ...for Applied Research Scientists and Research Engineers to take our foundation...  ...modern ML pipelines: pre‑training, fine‑tuning, evaluation,... 
    Training
    Full time

    Gigascale Capital

    Palo Alto, CA
    1 day ago
  •  ...possible, we are building across the entire robotics stack. We’re training state-of-the-art AI models that leverage our large-scale,...  ...time on the things they value most. As a Machine Learning Research Engineer, you will work on the software and algorithms that enable... 
    Training

    Sunday Robotics

    Mountain View, CA
    1 day ago
  •  ...and robot systems to the infrastructure and state-of-the-art foundation...  ...by our cutting edge research and end-to-end system...  ...'re looking for Research Scientists and Research Engineers to build the data and evaluation...  ...closely with pre-training and post-training teams to... 
    Training

    Rhoda ai

    Palo Alto, CA
    4 days ago
  •  ...the next generation of data infrastructure at Mistral AI. You will be a...  ...governed data access for MLOps and research. You will take full...  ...call rotations for critical training jobs. What will you do Build...  ...anticipates exabyte growth. Platform Engineering: Contribute to the... 
    Training
    Work at office
    Visa sponsorship

    Mistral AI

    Palo Alto, CA
    2 days ago
  • Ipro Networks Pte. Ltd. is seeking a Research Scientist / Engineer in Palo Alto, CA to develop and optimize distributed training infrastructure for multimodal foundation models. This role involves significant experience with PyTorch and managing large-scale GPU clusters... 
    Training
    Remote job

    Ipro Networks Pte. Ltd.

    Palo Alto, CA
    1 day ago
  •  ...million, and former DeepMind research scientist Jason Ma. The company has...  ...We are seeking a Research Engineer / Scientist to join our team...  ...production research pipelines—from training advanced models to...  ...production inference/debugging infrastructure. Familiarity with multi‑... 
    Training
    Temporary work

    Dyna Robotics

    Redwood City, CA
    5 days ago
  • $160k - $350k

    Collinear is a research-focused AI company building systems that...  .... We work across post-training, RL, distillation, and evaluation...  ...with founders, research scientists, and engineering leads High-impact...  ...them into robust, scalable infrastructure that moves the needle on real... 
    Training
    Full time
    Work at office
    Local area
    Immediate start
    Relocation package
    Flexible hours

    Collinear AI, Inc.

    Sunnyvale, CA
    1 day ago
  • $150.29k - $171.67k

     ...Research Computing Cloud Engineer Business Affairs: University IT (UIT), Stanford, California, United...  ...Responsibilities Research, HPC & AI Infrastructure: Architect cloud-native and hybrid...  ...intensive workloads, such as AI training and genomics. Cloud Bursting & Integration... 
    Training
    Hourly pay
    Full time
    Fixed term contract
    Weekend work
    Afternoon shift

    Stanford University

    Stanford, CA
    3 days ago
  • Senior Research Engineer, Training Data Infrastructure in Foundation Models Cupertino, California, United States - Software and Services Our team is dedicated...  ...data itself. You will work alongside Research Scientists to transform theoretical observations into concrete,... 
    Training

    Apple Inc.

    Cupertino, CA
    1 day ago
  • $126k - $423k

     ...Valley company is creating the digital infrastructure needed to bring intelligence to every...  ...We are looking for a passionate Research Engineer (AI/RL Infrastructure) to join the Research...  ..., YOU WILL: * Design and build training and evaluation infrastructure to support... 
    Training
    Full time
    For contractors
    For subcontractor
    Casual work
    Work at office
    Immediate start
    Remote work
    Day shift

    Applied Intuition

    Sunnyvale, CA
    1 day ago
  •  ...hardware and robot systems to the infrastructure and state-of-the-art foundation...  ...made possibly by our cutting edge research and end-to-end system design. We'...  ...reality. We're looking for a Research Engineer to build and maintain the training platform that powers our model... 
    Training

    Rhoda ai

    Palo Alto, CA
    5 days ago
  • $207k - $300k

    AI Innovation and Research Software Engineer, Platforms and Devices Google | Mountain View, CA, USA...  .... Experience with Machine Learning Infrastructure. Experience with Machine Learning Research...  ..., and relevant education or training. Your recruiter can share more about... 
    Training
    Full time

    Google Inc.

    Mountain View, CA
    3 days ago
  • Harmonic, located in Palo Alto, California, is seeking engineers for its reinforcement learning team. You will maintain and optimize the RL training and serving infrastructure, ensuring peak performance for model workloads. The ideal candidate has a strong background in... 
    Training

    Harmonic

    Palo Alto, CA
    5 days ago
  • Rhoda AI, based in California, is seeking Research Scientists and Research Engineers to innovate in large-scale pre-training for video action models. You will lead the design and training of causal video generation models using web-scale video data. The ideal candidate... 
    Training

    Rhoda ai

    Palo Alto, CA
    5 days ago
  • A cutting-edge web technology company is seeking a Research Engineer to enhance its core research product. The role involves improving models for web-scale indexing and establishing training strategies. Candidates should possess deep intuitions for modern model systems... 
    Training

    Parallel Web Systems

    Palo Alto, CA
    4 days ago
  •  ...trade-offs into upside. Make high-conviction bets - Try and fail. But succeed an unfair amount. Job Research engineer. You will enhance our core research product to train and scale models that serve a web-scale index. This includes shaping model architecture and... 
    Training

    Parallel Web Systems

    Palo Alto, CA
    4 days ago
  •  ...growing product lines. We're looking for a Research Engineer who can both set technical direction...  ..., spoken-language domains. Design, train, and evaluate neural MT models — from...  ...adjacent problems across the ML org. Data & infrastructure Identify, source, and curate training... 
    Training

    Sanas

    Palo Alto, CA
    2 days ago
  •  ...’s most advanced mathematical reasoning engine, recently achieving Gold Medal-level performance...  ...seeking a highly motivated and skilled Research Engineer to join our Reinforcement...  ...to Chatbots That Don't Make Stuff Up? Training Data Podcast: Why Vlad Tenev and Tudor Achim... 
    Training

    Harmonic

    Palo Alto, CA
    4 days ago
  •  ...are building a mathematical reasoning engine that operates with absolute precision....  ...communication primitives to our distributed training loops and inference engines. We are...  ...our proprietary RL training and serving infrastructure. You have the authority to refactor any... 
    Training

    Harmonic

    Palo Alto, CA
    5 days ago
  • $174k - $255k

    Software Engineering Mountain View, CA (HQ) About the Team: Our team...  ...products, and is not a research‑based role. Our team is small...  ...Google Cloud Platform‑based infrastructure for software development and...  ..., and relevant education or training. Your recruiter can share more... 
    Training
    Full time

    X Development, LLC

    Mountain View, CA
    4 days ago
  • Research Engineer - Multimodal AI Los Altos, CA About Orbifold AI Backed by top-tier VCs in...  ...redefining multimodal data curation and model training for enterprise AI. Our mission is to...  ...that powers our multimodal AI infrastructure. Your primary focus will be developing... 
    Training
    Flexible hours

    Bonfirevc

    Palo Alto, CA
    5 days ago
  •  ...culture on Role Summary  About the Research Engineering team The team spans Platform (shared...  ...models. Working hand-in-hand with Research Scientists, you’ll either join: - Platform RE Team: Enhance the shared training framework, data pipelines and cluster... 
    Training
    Work at office
    Visa sponsorship

    Mistral AI

    Palo Alto, CA
    5 days ago
  •  ...Labs is an applied AI research lab pioneering data...  ...environment curation for training and evaluating agents...  ...for a Research Engineer to bridge cutting-edge...  ...systems or research infrastructure at scale Proficiency...  ...engineering or applied scientist role Contributions to... 
    Training

    Bespoke Labs

    Mountain View, CA
    2 days ago
  • $174k - $252k

    Research Engineer, Gemmaverse Variants Research, DeepMind corporate_fare DeepMind place Mountain...  ...experience, and relevant education or training. Your recruiter can share more about...  ...Responsibilities Contribute to core tools and infrastructure for LLM research, driving major... 
    Training
    Full time

    Google Inc.

    Mountain View, CA
    5 days ago
  •  ...in machine learning engineering or large-scale software...  ..., or responsible AI research. Experience in Python...  ...Architect and optimize training and inference...  ...Collaborate with Research Scientists to translate safety research...  ...maintain evaluation infrastructure to systematically... 
    Training

    WeAreTechWomen

    Mountain View, CA
    1 day ago
