Senior ML Infrastructure Engineer: Scale GPU Training & HPC

Dyna Robotics

A cutting-edge robotics company based in California is looking for an experienced Machine Learning Infrastructure Engineer. This role involves designing scalable ML training platforms, optimizing high-performance computing systems, and ensuring robust job scheduling and reliability. Ideal candidates will have 7+ years in software with hands-on experience in ML model tuning and managing cloud environments. Join us to shape the future of AI-driven robotics.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Senior ML Infrastructure Engineer: Scale GPU Training & HPC in Redwood City, CA vacancy
  • $153.2k - $234.1k

     ...transportation on a global scale. Role Are you passionate...  ...-world scenarios. As a Senior ML engineer, you will build critical infrastructure that powers every...  ...supporting machine learning training and evaluation workflows...  ...ML training across large GPU/CPU clusters or specialized... 
    Senior
    Training
    Remote work
    Relocation package
    Flexible hours

    General Motors

    Mountain View, CA
    1 day ago
  •  ..., Inc. is looking for a skilled data engineer in San Carlos, California, to design, build, and maintain large-scale data pipelines for training and evaluation of robotics foundation...  ...Responsibilities include managing core data infrastructure and collaborating with a dedicated... 
    Senior
    Training

    AI Chopping Block, Inc.

    San Carlos, CA
    6 days ago
  •  ...Mountain View, California is looking for an experienced Data Engineer to design large-scale data pipelines and advanced data systems for autonomous...  ...in Python, and experience working with large-scale data infrastructure. This role offers a competitive salary range and a... 
    Senior

    I did my part and supported the Regular Toilet

    Mountain View, CA
    1 day ago
  • $181k - $297k

     ...We are seeking an HPC Network Engineer to design, deploy, and...  ...Ethernet fabrics for large-scale GPU clusters. The role...  ...supporting AI/ML training, inference, and HPC...  ...RDMA traffic. As a Senior Staff Software Engineer...  ...Experience with infrastructure automation or configuration... 
    Senior
    Training
    For contractors
    Work at office
    Flexible hours

    LinkedIn

    Mountain View, CA
    2 days ago
  • The Mission: As a Senior Machine Learning Engineer, you will be responsible for building...  ...processes for model training, fine-tuning, testing, and...  ...Generative AI models at significant scale. Investigate, prototype and...  ...with building and evolving ML Training and Inferencing... 
    Senior
    Training
    Local area

    Typeface

    Palo Alto, CA
    19 hours ago
  • $128.7k - $261.3k

     ...learning models from training frameworks (e.g. PyTorch...  ...two‑fold: build the ML deployment platform that...  ...performed manually by engineers. Build the developer...  ...production platform or infrastructure systems where reliability...  ...with the NVIDIA GPU stack at the integration... 
    Senior
    Training
    Local area
    Remote work
    Flexible hours
    Shift work

    General Motors

    Mountain View, CA
    1 day ago
  •  ...team at GM builds core infrastructure that supports the end-...  ...-end AI lifecycle of ML pipelines—from local experimentation...  ...and large-scale training to evaluation, lineage...  ..., enabling ML engineers and researchers to develop...  ...environments. The Role: As a Senior AI/ML Engineer, you... 
    Senior
    Training
    Local area
    Remote work
    Work from home
    Relocation
    Relocation package
    Flexible hours

    Israelvcforum

    Mountain View, CA
    2 days ago
  • $148k - $247k

     ...diverse perspectives and teamwork. As a Senior AI/ML Platform Engineer, you will architect and scale the ML platform for data scientists and ML...  ...model monitoring. Design and implement infrastructure for model training, hyperparameter tuning, experiment tracking... 
    Senior
    Training
    Full time
    Part time
    Immediate start
    Flexible hours

    Guidewire

    San Mateo, CA
    4 days ago
  • $242.1k - $293.8k

     ...technical challenges at scale, and helping to create...  ...ads machine learning infrastructure to deliver effective...  ...advertisers. As a Senior Machine Learning Infrastructure Engineer in our Ads ML Infra team, you'll build...  ...infrastructure including model training, data pipelines,... 
    Senior
    Training
    Full time
    Work experience placement
    H1b
    Work at office
    Local area
    Visa sponsorship
    Monday to Friday

    Roblox

    San Mateo, CA
    4 days ago
  •  ...proud to serve as the infrastructure platform for teams developing...  ...high-impact, ML-centric use cases. About...  ...Role We are seeking a Senior ML Infrastructure engineer to help build and scale robust Compute platforms...  ...performance computing (HPC). Experience working with... 
    Senior

    General Motors

    Mountain View, CA
    1 day ago
  • $170k - $240k

     ...delivering-driven expert in ML Training Infrastructure with a strong ability to...  ...development initiatives. As a Senior ML Engineer, you will collaborate...  ...support model training at scale. Model training performance...  ...distributed computing, GPU computing, and cloud environments... 
    Senior
    Training
    Local area
    Remote work
    Work from home
    Relocation
    Relocation package
    Flexible hours

    General Motors

    Mountain View, CA
    4 days ago
  •  ...developing end-to-end ML models for robot...  ...expertise: data pipelines, training infrastructure or inference. You'll...  ...multimodal data, scaling distributed training,...  ...distributed training across GPU clusters with minimal...  ...For Strong software engineering and systems... 
    Training

    Sunday

    Mountain View, CA
    1 day ago
  • $96.8k - $251.6k

     ...Description The Senior Principal AI Agent / ML Software Engineer is a Senior Staff-...  ...on Oracle Cloud Infrastructure (OCI). This person...  ...used in large-scale, business-critical...  ...high throughput, GPU efficiency, reliability...  ...GPU inference or training workloads for latency... 
    Senior
    Training
    Temporary work
    Flexible hours

    Oracle

    Redwood City, CA
    6 days ago
  •  ...Network Engineer - AI/HPC Memphis, TN; Palo Alto, CA...  ...world to build a 100k GPU cluster on an ethernet...  ...can develop at hyper scale while optimizing performance...  ...optimize it to our training models and how we...  ...seamlessly build-out new GPU infrastructure with little to no... 
    Training

    Xai

    Palo Alto, CA
    5 days ago
  •  ...We are seeking an experienced Machine Learning Infrastructure Engineer to join our team and help scale our ML training platform. In this role, you will be responsible...  ...improve training performance across an expanding GPU ecosystem. You will work on cutting‑edge high-performance... 
    Training
    Local area

    Dyna Robotics

    Redwood City, CA
    2 days ago
  •  ...A leading robotics company in Palo Alto seeks a Staff/Principal ML Systems Engineer to enhance training performance for their innovative humanoid robots. You will optimize distributed training systems and engage closely with researchers to transform model changes into... 
    Senior
    Training

    Rhoda ai

    Palo Alto, CA
    2 days ago
  •  ...black.ai is looking for a skilled platform engineer in Palo Alto to enhance our AWS infrastructure and support quantum simulations. This role requires strong experience...  ...in platform engineering, DevOps practices, and GPU workloads. As a platform engineer, you will improve... 
    Senior

    Black Inc

    Palo Alto, CA
    1 day ago
  • $153k - $222k

     ...Decisive Point is looking for infrastructure engineers and ML engineers to join the Data & ML infra group in Mountain View, California. The role focuses on working across the ML lifecycle and solving broad data problems. Ideal candidates will have software engineering... 
    Senior
    Training

    Decisive Point

    Mountain View, CA
    1 day ago
  •  ...Zoox is seeking a Senior Software Engineer, ML Core to optimize ML tooling for autonomous driving. Join a mission-focused team to develop tools that accelerate model training and deployment. Your 6+ years of experience in Python or C++ will help drive innovations in machine... 
    Senior
    Training

    jobs.frontdoordefense.com - Jobboard

    Foster, CA
    1 day ago
  • $150k - $170k

     ...advantages for scale: photons don't feel...  ...fiber-optic infrastructure. In 2024, PsiQuantum announced...  ...Software Engineering Team builds...  ...with GPU-accelerated computing...  .... GPU/HPC Bridge Work (30...  ...infrastructure, ML platforms, or early...  ...education and training, competencies,... 
    Senior
    Training
    Full time
    Shift work

    PsiQuantum

    Stanford, CA
    9 days ago
  • $197.3k - $313.7k

     ...Responsibilities Build, scale and maintain critical features and...  ...with architects, product owners, engineers, user experience designers and...  .... Experience with modern AI/ML frameworks, systems, libraries...  ...compensation, promotion, benefits, training, assessment of job performance... 
    Senior
    Training
    Flexible hours

    Centaur Labs

    Palo Alto, CA
    1 day ago
  • $162.78k - $221.47k

     ..., and fastest scales. From particle...  ...At SLAC, our infrastructure powers the discovery...  ...seeking a Senior Kubernetes Engineer to help...  ...guidance and training to help users...  ...workloads Optimize GPU and...  ...Familiarity with AI/ML infrastructure...  ...computing or HPC environment... 
    Senior
    Training
    Worldwide
    Flexible hours
    Night shift

    Stanford University

    Menlo Park, CA
    2 days ago
  •  ...Senior Solution Architect – AI / GPU Cloud Mountain View, California...  ...intersection of AI infrastructure, GPU cloud...  ..., enabling large-scale AI/ML and HPC workloads. Key...  ...Center Ops, and Engineering teams Identify...  ...with distributed training/inference, Kubernetes... 
    Senior
    Training

    Glint Tech Solutions LLC

    Mountain View, CA
    6 days ago
  •  ...technologies. We are looking for a Senior Machine Learning Engineer to build the AI foundation...  ..., from model research and training to deployment on embedded...  ...evaluation frameworks that scale across imaging datasets....  ..., with at least one major ML framework (PyTorch,... 
    Senior
    Training

    Kodiak Sciences Inc

    Palo Alto, CA
    6 days ago
  • $155.42k - $395.9k

     ...Description About the Team: The ML Inference Platform is part of the AV ML Infrastructure organization. Our team...  ...committed to maximizing GPU utilization across...  ...the Role: We are seeking a Senior ML Infrastructure engineer to help build and scale robust platforms for ML Inference... 
    Senior
    Local area
    Remote work
    Relocation
    Relocation package
    Flexible hours

    Israelvcforum

    Mountain View, CA
    1 day ago
  • $188k - $250k

     ...build, and productionize large-scale NLP and LLM systems that power...  ...that analyze AI Answering engine outputs and public web content...  ...customer problems into measurable ML deliverables and ship...  ...to production: data pipelines, training and inference, CI/CD for ML, observability... 
    Senior
    Training
    Local area

    Meltwater

    Redwood City, CA
    6 days ago
  • $195k - $298k

     ...Technical Center – Cole Engineering Center Podium or...  ...the Team The ML Compute Platform is...  ...organization within Infrastructure Platforms. Our...  ...platform supports training and deployment of...  ...commit to maximizing GPU utilization across...  ...Engineer to build and scale robust compute... 
    Training
    Local area
    Relocation package
    Flexible hours

    General Motors

    Mountain View, CA
    1 day ago
  • $248.71k - $292.6k

     ...developers the speed and scale they need....  ...Staff Software Engineer - High Performance GPU Inference...  ...software-defined infrastructure. Low‑Level GPU Optimization...  ...teams across ML compilers,...  ...and optimizing ML/HPC workloads on GPU...  ...with multi‑GPU training/inference frameworks... 
    Senior
    Training

    I did my part and supported the Regular Toilet

    Palo Alto, CA
    19 hours ago
  •  ...About the role We're looking for seasoned ML Infrastructure engineers with experience designing, building and maintaining training and serving infrastructure for ML research....  ...generally support our research Maximize GPU allocation and utilization for both serving... 
    Training

    Character

    Redwood City, CA
    5 days ago
  • $317k - $370k

     ...Senior Engineering Manager, ML Platform Zoox is on a mission to reimagine...  ...growing Software Infrastructure engineering...  ...work on cutting-edge training and inference optimization...  ...experimentation and scale our multi-modal Foundation...  ...~ Experience with GPU-accelerated... 
    Senior
    Training

    Zoox

    San Mateo, CA
    4 days ago
