Senior ML Systems Engineer Distributed Training at Scale

Rhoda ai

A leading robotics company in Palo Alto seeks a Staff/Principal ML Systems Engineer to enhance training performance for their innovative humanoid robots. You will optimize distributed training systems and engage closely with researchers to transform model changes into scalable implementations. This role promises significant impact on research cycles, enabling advancements in real-world robotics. Ideal candidates have extensive experience in distributed training and modern ML tools, thrive in fast-paced environments, and possess strong debugging skills.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Senior ML Systems Engineer Distributed Training at Scale in Palo Alto, CA vacancy
  •  ...infrastructure company in California seeks a Member of Technical Staff — Training to design and optimize large-scale distributed training systems for frontier AI models. Candidates should have 5+ years of experience in ML systems and be proficient in Python along with another... 
    Training

    RadixArk

    Palo Alto, CA
    5 days ago
  • The Mission: As a Senior Machine Learning Engineer, you will be responsible...  ...machine learning models/systems and innovative web...  ...processes for model training, fine-tuning, testing...  ...models at significant scale. Investigate, prototype...  ...and evolving ML Training and Inferencing... 
    Senior
    Training
    Local area

    Typeface

    Palo Alto, CA
    4 days ago
  • $153.2k - $234.1k

     ...hardware and battery systems to intuitive...  ...transportation on a global scale. Role Overview:...  ...machine learning engineer working on our...  ...vehicles. As a Senior ML Infra Engineer,...  ...machine learning model training and evaluation...  ...building large-scale distributed systems/... 
    Senior
    Training
    Work at office
    Local area
    Remote work
    Work from home
    Relocation
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    5 days ago
  • $200k - $400k

     ...Institute Of Foundation Models Engineer The Institute of Foundation Models...  ...(IFM) designs and operates ultra-scale GPU supercomputing systems to train next-generation foundation models....  ...driving communication performance, distributed reliability, and cross-layer optimization... 
    Senior
    Training
    Visa sponsorship

    Institute of Foundation Models

    Sunnyvale, CA
    2 days ago
  • $188.5k - $282.7k

    Rubrik, Inc. is seeking a Senior Software Engineer for its Atlas Distributed Systems team. You'll design and deliver innovative solutions for cloud storage while guiding architectural trends within our distributed file systems. The ideal candidate has a degree in Computer... 
    Senior

    Rubrik, Inc.

    Palo Alto, CA
    4 days ago
  •  ...all of their business systems through natural language...  ...Moveworks' Reasoning Engine and natural language...  ...backed by the global scale of ServiceNow and the...  ...help build cutting edge ML infrastructure for building...  ...including distributed training and inference pipeline... 
    Senior
    Training
    Work at office
    Remote work
    Flexible hours

    ServiceNow

    Mountain View, CA
    2 days ago
  •  ...of their business systems through natural language...  ...' Reasoning Engine and natural language...  ...by the global scale of ServiceNow and...  ...datasets for model training and evaluation....  ..., and keeping our ML at the cutting edge...  ...We approach our distributed world of work with... 
    Senior
    Training
    Work at office
    Immediate start
    Remote work
    Flexible hours

    ServiceNow

    Mountain View, CA
    3 days ago
  • Cerebras Systems builds the world's largest AI chip, 5...  ...GPUs. Our novel wafer‑scale architecture provides the...  ...industry‑leading training and inference speeds and...  ...effortlessly run large‑scale ML applications, without...  ...versatile and experienced engineer to join our SOTA... 
    Senior
    Training
    Internship

    Cerebras

    Sunnyvale, CA
    2 days ago
  • $300k - $400k

     ...You will own the systems layer that makes our frontier model training and inference fast...  ...bottlenecks in large-scale training runs...  ...communication primitives, or distributed training...  ...benchmarking distributed ML systems to...  ...— the scientists, engineers, and problem-solvers... 
    Training
    Visa sponsorship
    Flexible hours
    Shift work

    Periodic Labs

    Menlo Park, CA
    9 days ago
  • $166k - $225k

     ...their business. Founded by engineers — and customer obsessed...  ...with data to scaling our services and infrastructure...  ...building the next generation distributed data storage and processing systems that can outperform...  ...relevant certifications and training, and specific work... 
    Senior
    Training
    Local area
    Worldwide

    Databricks

    Mountain View, CA
    5 days ago
  •  ...AI lab in Santa Clara is seeking a skilled software engineer with over 8 years of experience to optimize machine...  ...-time applications. The role involves designing distributed training strategies, collaborating with ML researchers, and developing tools for performance enhancement... 
    Senior
    Training

    Odyssey

    Santa Clara, CA
    2 days ago
  • $224k - $356.5k

     ...Clara is seeking exceptional Senior Machine Learning and Simulation Engineers for their Autonomous...  ...design and development of large-scale RL training frameworks to enhance multi-...  ...over 12 years of experience in ML training, simulating AV systems, and must be proficient in C++... 
    Senior
    Training

    NVIDIA Gruppe

    Santa Clara, CA
    5 hours ago
  • $159.3k - $230.7k

     ...hardware and battery systems to intuitive...  ...transportation on a global scale. The Data...  ...works on and delivers ML models to the...  ...foundation model pre-training and fine-tuning...  ...impact team of AI/ML engineers, data scientists...  ...vehicles. As a Senior AI/ML Engineer in... 
    Senior
    Training
    Local area
    Remote work
    Work from home
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    5 days ago
  •  ...breakthrough hardware and battery systems to intuitive design,...  ...on a global scale. The Data Scaling team...  ...works on and delivers ML models to the product that...  ...such as unsupervised pre-training, imitation learning, reinforcement...  ...quick iteration by distributed teams. Strong data... 
    Training
    Local area
    Remote work
    Relocation
    Relocation package
    Flexible hours

    General Motors

    Mountain View, CA
    4 hours ago
  • $153.2k - $234.1k

     ...hardware and battery systems to intuitive design,...  ...transportation on a global scale. Role Are you...  ...world scenarios. As a Senior ML engineer, you will build critical...  ...machine learning training and evaluation workflows...  ...building large-scale distributed systems, applications... 
    Senior
    Training
    Remote work
    Relocation package
    Flexible hours

    General Motors

    Mountain View, CA
    5 hours ago
  • $281k - $356k

     ...Senior Staff ML Engineer, Driver Understanding and Evaluation Waymo...  ...learning and data systems, simulation workflow...  ...learning models to deliver training and evaluation data...  ...fine-tuning large-scale generative models to...  ...Experience with large-scale distributed training and data... 
    Senior
    Training
    Full time

    Waymo

    Mountain View, CA
    3 days ago
  • $150k - $230k

     ...Senior Systems Engineer - AI Infrastructure On Site, Palo Alto, California...  ..., high-performance distributed GPU training. You'll work at the intersection...  ...implementing systems that run at scale. This is a systems...  ...(RDMA, InfiniBand) ML framework or runtime internals... 
    Senior
    Training

    Clockwork Systems

    Palo Alto, CA
    3 days ago
  • $155.42k - $395.9k

     ...to-end AI lifecycle of ML pipelines—from local experimentation...  ...and large-scale training to evaluation, lineage...  ...span both backend systems and user-facing interfaces, enabling ML engineers and researchers to develop...  ..., and test scalable distributed computing and data processing... 
    Senior
    Training
    Local area
    Remote work
    Relocation
    Flexible hours

    General Motors

    Mountain View, CA
    5 hours ago
  •  ...Sunnyvale, California, is looking for an experienced engineer to join its SOTA Training Platform team. The ideal candidate will have...  ...frameworks. Responsibilities include bringing ML models to life on Cerebras CSX systems, performance tuning, and contributing to tool improvements... 
    Senior
    Training

    Cerebras

    Sunnyvale, CA
    2 days ago
  • $140k - $185k

     ...unleashing autonomy at scale to transform the battlefield...  ...lives at risk. Our systems operate with distributed control, dynamic...  ...We are seeking a Senior Network Systems Engineer to deploy, operate, and...  ...education, specialized training, critical expertise, training... 
    Senior
    Training
    Full time
    Temporary work
    Work experience placement
    Local area
    Remote work

    Forterra, Inc.

    East Palo Alto, CA
    3 days ago
  •  ...Senior AI Systems Performance Engineer Palo Alto, California, United States...  ...businesses and operations at scale. SambaNova Suite™...  ...talented and driven ML performance engineer...  ...single-node and distributed systems. Basic...  ...or multimodal model training and inference.... 
    Senior
    Training

    SambaNova Systems

    Palo Alto, CA
    3 days ago
  •  ...technology company is hiring a Machine Learning Systems Engineer in Cupertino, California. You will...  ...Siri modeling teams to optimize model training and inference on Apple's custom Silicon....  ...ideal candidate has strong experience in ML models, with proficiency in Python and... 
    Training

    Apple

    Cupertino, CA
    5 hours ago
  •  ...Nuro, based in Mountain View, is seeking senior engineers to build and scale its large-scale computing infrastructure. The role involves designing...  ...applications. The ideal candidate has experience with distributed applications and holds a bachelor's degree in Computer... 
    Senior

    Nuro

    Mountain View, CA
    4 hours ago
  •  ...in California is looking for an experienced Machine Learning Infrastructure Engineer. This role involves designing scalable ML training platforms, optimizing high-performance computing systems, and ensuring robust job scheduling and reliability. Ideal candidates will have... 
    Senior
    Training

    Dyna Robotics

    Redwood City, CA
    5 hours ago
  •  ...AI lifecycle of ML pipelines—from local...  ...and large-scale training to evaluation, lineage...  ...span both backend systems and user-facing...  ...interfaces, enabling ML engineers and researchers...  .... The Role: As a Senior AI/ML Engineer,...  ...test scalable distributed computing and data... 
    Senior
    Training
    Local area
    Remote work
    Work from home
    Relocation
    Relocation package
    Flexible hours

    Israelvcforum

    Mountain View, CA
    1 day ago
  • $158k - $241.9k

     ...breakthrough hardware and battery systems to intuitive design, intelligent software...  ...of transportation on a global scale. Role: As a Senior AI/ML Engineer within the Onboard Embodied AI organization...  ...with sophisticated neural networks trained from large-scale driving data and... 
    Senior
    Training
    Relocation package
    Flexible hours

    General Motors

    Mountain View, CA
    1 day ago
  • $167.2k - $250.8k

     ...world-class autonomous driving system that combines AD hardware...  ...and we are looking for an ML Software Engineer to join our Online Mapping...  ...label management, as well as training pipelines. About the Role...  ...and infrastructure such as distributed training and ML compilers.... 
    Senior
    Training

    I did my part and supported the Regular Toilet

    Mountain View, CA
    1 day ago
  • $195k - $230k

     ..., recommendation systems, and adtech. Recognized...  ...challenges at scale. Together, we...  ...looking for a Senior Machine Learning Engineer to help evolve...  ...from offline training → online inference...  ...large-scale data and ML systems (e.g., Spark, distributed training, real-... 
    Senior
    Training
    Full time
    Local area
    Work from home

    NewsBreak

    Mountain View, CA
    2 days ago
  • $133.95k - $245k

     ...for an exceptional Senior Machine Learning Engineer to help shape the future...  ...thinking, and scale that don't always have...  ...Improving evaluation and training or finetune models...  ...machine learning systems using production‑grade...  ...pipelines using distributed compute frameworks to... 
    Senior
    Training
    Work at office
    Remote work
    Flexible hours
    Shift work
    3 days per week

    Unchain Data

    Menlo Park, CA
    1 day ago
  • $144.7k - $261.3k

     ...infrastructure, and ML/AI GPU platforms...  ...is looking for a Senior Performance Engineer to join the AV...  ...input into large scale ML infrastructure...  ...’s long-term GPU system strategy and "evergreen...  ...large-scale ML training and inference...  ...within large-scale distributed production... 
    Senior
    Training
    Local area
    Remote work
    Work from home
    Flexible hours
    3 days per week

    General Motors

    Sunnyvale, CA
    1 day ago
