ML Infra Engineer: GPU Fleet & Inference Orchestrator

Generalist

Generalist is seeking a candidate to manage GPU fleets for training large-scale AI models. You will optimize ML data loading, storage, and orchestration of robot inference fleets in compute-constrained environments. Ideal candidates have deep experience with GPUs, Slurm or Kubernetes, and a strong understanding of the ML hardware stack. Your role will significantly contribute to making general-purpose robots a reality. Join a team from leading AI labs committed to pioneering robotics and AI advancements.

Vacancy posted 1 day ago
