ML Infra Engineer: GPU Fleet & Inference Orchestrator

Generalist

Generalist is seeking a candidate to manage GPU fleets for training large-scale AI models. You will optimize ML data loading, storage, and orchestration of robot inference fleets in compute-constrained environments. Ideal candidates have deep experience with GPUs, Slurm or Kubernetes, and a strong understanding of the ML hardware stack. Your role will significantly contribute to making general-purpose robots a reality. Join a team from leading AI labs committed to pioneering robotics and AI advancements.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the ML Infra Engineer: GPU Fleet & Inference Orchestrator in San Francisco, CA vacancy
  • Reducto, Inc. is hiring a Machine Learning Infra Engineer in San Francisco to build and maintain ML training and inference frameworks. The role focuses on high performance and scaling across multiple nodes and GPUs. The ideal candidate will have strong Python skills and... 
    Suggested

    Reducto, Inc.

    San Francisco, CA
    4 days ago
  • Reducto, a fast-growing AI company in San Francisco, is hiring a Machine Learning Infra Engineer. This role involves building and maintaining the training and inference frameworks necessary for optimal performance. Ideal candidates should possess strong Python skills,... 
    Suggested

    Reducto

    San Francisco, CA
    4 days ago
  •  ...company based in San Francisco is seeking a specialist to design and operate large-scale GPU infrastructure. This role requires expertise in deploying GPU systems for high-throughput inference and model performance optimization. The ideal candidate will have hands-on... 
    Suggested

    Reflection AI

    San Francisco, CA
    5 days ago
  •  ...training, from managing GPU/TPU compute and job orchestration to building reusable...  ...and model engineers to translate ideas into...  ...the intersection of ML, software engineering...  ...Will Own training/inference infrastructure: Design...  ...research needs into infra capabilities and guide... 
    Suggested
    Full time

    Monograph

    San Francisco, CA
    2 days ago
  • $100k - $200k

    Voiceflow is seeking a skilled ML-Infrastructure Engineer in San Francisco to architect and operate auto-scaling systems for our voice AI simulation platform. The role includes optimizing GPU and compute infrastructure, ensuring high performance and reliability. Ideal... 
    Suggested
    Work at office

    Voiceflow

    San Francisco, CA
    5 days ago
  •  ...Associates Limited is seeking a Senior ML Infrastructure Engineer to help build and scale Kubernetes-...  .... This role focuses on workload orchestration, GPU scheduling, and ensuring system...  ...experience with both training and inference infrastructure. The position offers... 

    Hamilton Barnes Associates Limited

    San Francisco, CA
    3 days ago
  • $325k

    About the Team Our Inference team brings OpenAI...  ...Role We're hiring engineers to scale and...  ...infrastructure across emerging GPU platforms. You'll...  ...with research, infra, and performance...  ...models across fleets of accelerators. Enjoy...  ...libraries, and orchestration layers. Are excited... 
    Fleet

    Centaur Labs

    San Francisco, CA
    4 days ago
  •  ...Francisco is looking for a Senior Software Engineer to build scalable infrastructure for large...  ...distributed training systems and optimize GPU utilization while collaborating with cross...  ...have over 5 years of experience in ML infrastructure and a strong background in... 

    Baseten

    San Francisco, CA
    5 days ago
  •  ...leading AI infrastructure company is seeking a Senior ML Performance Engineer to design a comprehensive performance testing...  ...in performance engineering and strong experience with GPU programming and ML inference workloads. Candidates should have expertise in Python... 

    Amadeus Search

    San Francisco, CA
    2 days ago
  •  ...are hiring a Machine Learning Engineer to help us train and deploy the...  ...product. The Opportunity As an ML Infra Engineer, you’ll play a key role in building the inference and training frameworks that make...  ...across multi-node, multi-GPU environments with strong reliability...
    Work at office
    Local area

    Reducto

    San Francisco, CA
    4 days ago
  • $128.7k - $261.3k

     ...export, kernel development, and performance engineering so that every cycle on our accelerators translates...  ...The AI Kernels team builds high‑performance GPU kernels and custom libraries that sit at the heart of on‑vehicle ML inference for ADAS and autonomous driving. We own... 
    Local area
    Flexible hours

    Israelvcforum

    San Francisco, CA
    5 days ago
  • A leading technology company is looking for an ML Infrastructure Engineer in San Francisco. The successful candidate will build and maintain ML training pipelines and ensure low-latency model serving. Candidates should have over 4 years of experience in ML engineering,... 
    Work at office

    Lattice, Inc.

    San Francisco, CA
    4 days ago
  • Reactor is looking for an experienced ML Inference Engineer with deep expertise in high-performance ML engineering. This role focuses on optimizing...  ...a related field is required, along with strong knowledge of GPU hardware and modern ML optimization techniques. The position... 

    Reactor

    San Francisco, CA
    5 days ago
  • We're looking for an ML Inference Engineer with deep expertise in high-performance ML engineering. This is a highly technical, high-impact role...  ...6), and advanced serving architectures Working knowledge of GPU hardware (NVIDIA) Strong understanding of transformer architectures... 
    Full time
    Visa sponsorship
    Relocation package

    Reactor

    San Francisco, CA
    5 days ago
  •  ...deploy, and maintain large distributed ML training and inference clusters Develop efficient, scalable...  ...Analyze, profile and debug low-level GPU operations to optimize performance...  ...Familiarity with containerization and orchestration frameworks (e.g., Kubernetes, Docker)... 

    Kindredventures

    San Francisco, CA
    2 days ago
  • $128.7k - $261.3k

     ...development, and performance engineering so that every cycle on...  ...into fast, reliable inference across GPUs powering GM...  ...shipped to production fleets. You’ll join a group of...  ...compiler, systems, and GPU engineers who enjoy...  ...reliable, and effortless for ML engineers across the AV... 
    Fleet
    Local area
    Flexible hours

    Israelvcforum

    San Francisco, CA
    5 days ago
  • $200k - $250k

     ...Build and operate the ML platform that powers AppFolio...  ...scalable training, inference, and cost‑efficient...  ...ECS, SageMaker, GPU fleets, model serving, autoscaling...  ...including data pipelines, GPU orchestration, and evaluation....  ...with a significant AI infra footprint. Experience... 
    Fleet
    Remote work

    AppFolio

    San Francisco, CA
    5 days ago
  • Jaide Health is seeking an engineer for their Model Efficiency team in...  ...focuses on building reliable ML systems while enhancing core performance...  ...techniques such as GPU/CUDA optimizations and collaborate...  ...and insights into the LLM inference ecosystem. A commitment to diversity... 
    Remote job

    Jaide Health

    San Francisco, CA
    3 days ago
  •  ...looking for a Senior Machine Learning Systems Engineer in San Francisco. In this role, you will build and operate production inference systems, optimizing for performance and...  ...fluent in Python, and have strong knowledge in GPU-accelerated inference. Excellent communication... 

    MakerMaker.AI

    San Francisco, CA
    2 days ago
  • $180k - $250k

    A tech innovation company is looking for a hands-on engineer in San Francisco to manage a vast fleet of GPU servers. You will build systems for tracking server lifecycle, automate provisioning and health checks, and ensure OS-level security. The role requires 5+ years of... 
    Fleet

    Fal

    San Francisco, CA
    5 days ago
  • $209k - $253k

    A leading AI infrastructure company in San Francisco seeks a Staff Software Engineer to design and develop control systems for GPU node management. The candidate will be critical in building foundational cloud infrastructure and achieving business goals. This role requires... 
    Fleet

    Crusoe Energy Systems LLC

    San Francisco, CA
    1 day ago
  • $250k

     ...for large-scale AI training and inference workloads. With expanding GPU infrastructure across Europe...  ...infrastructure limitations. As a Senior ML Infrastructure Engineer, the successful candidate will...  .... The role focuses on workload orchestration, GPU scheduling, inference... 

    Hamilton Barnes Associates Limited

    San Francisco, CA
    3 days ago
  • $220k - $320k

     ...love taking cutting-edge ML techniques and turning...  ...to meet you. About Inference.net Inference.net...  ...funded ten-person team of engineers who work in-person in...  ...large compute budget / GPU reservation, and...  ...training across our GPU fleet Deeply understand customer... 
    Fleet
    Work at office

    Inference

    San Francisco, CA
    2 days ago
  •  ...based in San Francisco, is seeking a Distributed Training and Inference Engineer to enhance its machine learning infrastructure. This role involves...  .... The ideal candidate will have over 5 years of experience in ML systems and be proficient in Python and C++. Sciforium offers... 
    Flexible hours

    Sciforium

    San Francisco, CA
    4 days ago
  • A cutting-edge technology company in San Francisco is seeking an ML Infrastructure Engineer to build and scale machine learning systems for real-time perception and inference. This role involves designing scalable training pipelines for computer vision models, optimizing... 

    Specter Services LLC

    San Francisco, CA
    2 days ago
  • A media technology company in San Francisco is seeking a Founding Engineer specializing in ML Inference. This highly technical role requires expertise in the ML infrastructure stack and aims to optimize generative media performance. The ideal candidate will drive innovations... 
    Relocation package

    Reactor.am

    San Francisco, CA
    1 day ago
  • $200k - $350k

    Inception in San Francisco is seeking engineers and scientists to design and optimize the compute...  ...role includes developing high-performance ML kernels for significant operations and...  ...precision arithmetic. A strong background in GPU programming and systems is necessary, as... 

    Inception LLC

    San Francisco, CA
    3 days ago
  •  ...scale. You're experienced with modern inference systems like TGI, vLLM, TensorRT-LLM,...  ...contributions and staying current with ML infrastructure developments Bring practical...  ...this usually requires a large engineering effort dedicated to building specialized...
    Work at office

    Gravity Engineering Services Pvt Ltd.

    San Francisco, CA
    1 day ago
  •  ...innovative team. You will own the optimizations for both training and on-robot inference stacks, focusing on achieving step-function gains. The ideal candidate should be proficient with the latest ML techniques and passionate about advancing robotics. Our mission is to make... 

    Generalist

    San Francisco, CA
    1 day ago
  • $200k

     ...deploying high‑throughput, ultra‑low‑latency inference engines for large language models or...  ...conversational AI. Possess a deep understanding of GPU architectures (NVIDIA Ampere/Hopper) and...  ...critical intersection between the core ML training team and the backend... 
    Full time
    Work at office

    Plaud

    San Francisco, CA
    5 days ago
