ML Infra Engineer: GPU Fleet & Inference Orchestrator
Generalist
Generalist is seeking a candidate to manage GPU fleets for training large-scale AI models. You will optimize ML data loading, storage, and orchestration of robot inference fleets in compute-constrained environments. Ideal candidates have deep experience with GPUs, Slurm or Kubernetes, and a strong understanding of the ML hardware stack. Your role will significantly contribute to making general-purpose robots a reality. Join a team from leading AI labs committed to pioneering robotics and AI advancements.
- Reducto, Inc. is hiring a Machine Learning Infra Engineer in San Francisco to build and maintain ML training and inference frameworks. The role focuses on high performance and scaling across multiple nodes and GPUs. The ideal candidate will have strong Python skills and...
- Reducto, a fast-growing AI company in San Francisco, is hiring a Machine Learning Infra Engineer. This role involves building and maintaining the training and inference frameworks necessary for optimal performance. Ideal candidates should possess strong Python skills,...
- ...company based in San Francisco is seeking a specialist to design and operate large-scale GPU infrastructure. This role requires expertise in deploying GPU systems for high-throughput inference and model performance optimization. The ideal candidate will have hands-on...
- ...training, from managing GPU/TPU compute and job orchestration to building reusable... ...and model engineers to translate ideas into... ...the intersection of ML, software engineering... ...Will Own training/inference infrastructure: Design... ...research needs into infra capabilities and guide... [Full time]
$100k - $200k
Voiceflow is seeking a skilled ML-Infrastructure Engineer in San Francisco to architect and operate auto-scaling systems for our voice AI simulation platform. The role includes optimizing GPU and compute infrastructure, ensuring high performance and reliability. Ideal... [Work at office]
- ...Associates Limited is seeking a Senior ML Infrastructure Engineer to help build and scale Kubernetes-... .... This role focuses on workload orchestration, GPU scheduling, and ensuring system... ...experience with both training and inference infrastructure. The position offers...
$325k
About the Team Our Inference team brings OpenAI... ...Role We're hiring engineers to scale and... ...infrastructure across emerging GPU platforms. You'll... ...with research, infra, and performance... ...models across fleets of accelerators. Enjoy... ...libraries, and orchestration layers. Are excited... [Fleet]
- ...Francisco is looking for a Senior Software Engineer to build scalable infrastructure for large... ...distributed training systems and optimize GPU utilization while collaborating with cross... ...have over 5 years of experience in ML infrastructure and a strong background in...
- ...leading AI infrastructure company is seeking a Senior ML Performance Engineer to design a comprehensive performance testing... ...in performance engineering and strong experience with GPU programming and ML inference workloads. Candidates should have expertise in Python...
- ...are hiring a Machine Learning Engineer to help us train and deploy the... ...product. The Opportunity As an ML Infra Engineer, you’ll play a key role in building the inference and training frameworks that make... ...across multi-node, multi-GPU environments with strong reliability... [Work at office, Local area]
$128.7k - $261.3k
...export, kernel development, and performance engineering so that every cycle on our accelerators translates... ...The AI Kernels team builds high‑performance GPU kernels and custom libraries that sit at the heart of on‑vehicle ML inference for ADAS and autonomous driving. We own... [Local area, Flexible hours]
- A leading technology company is looking for an ML Infrastructure Engineer in San Francisco. The successful candidate will build and maintain ML training pipelines and ensure low-latency model serving. Candidates should have over 4 years of experience in ML engineering,... [Work at office]
- Reactor is looking for an experienced ML Inference Engineer with deep expertise in high-performance ML engineering. This role focuses on optimizing... ...a related field is required, along with strong knowledge of GPU hardware and modern ML optimization techniques. The position...
- We're looking for an ML Inference Engineer with deep expertise in high-performance ML engineering. This is a highly technical, high-impact role... ...6), and advanced serving architectures Working knowledge of GPU hardware (NVIDIA) Strong understanding of transformer architectures... [Full time, Visa sponsorship, Relocation package]
- ...deploy, and maintain large distributed ML training and inference clusters Develop efficient, scalable... ...Analyze, profile and debug low-level GPU operations to optimize performance... ...Familiarity with containerization and orchestration frameworks (e.g., Kubernetes, Docker)...
$128.7k - $261.3k
...development, and performance engineering so that every cycle on... ...into fast, reliable inference across GPUs powering GM... ...shipped to production fleets. You’ll join a group of... ...compiler, systems, and GPU engineers who enjoy... ...reliable, and effortless for ML engineers across the AV... [Fleet, Local area, Flexible hours]
$200k - $250k
...Build and operate the ML platform that powers AppFolio... ...scalable training, inference, and cost‑efficient... ...ECS, SageMaker, GPU fleets, model serving, autoscaling... ...including data pipelines, GPU orchestration, and evaluation.... ...with a significant AI infra footprint. Experience... [Fleet, Remote work]
- Jaide Health is seeking an engineer for their Model Efficiency team in... ...focuses on building reliable ML systems while enhancing core performance... ...techniques such as GPU/CUDA optimizations and collaborate... ...and insights into the LLM inference ecosystem. A commitment to diversity... [Remote job]
- ...looking for a Senior Machine Learning Systems Engineer in San Francisco. In this role, you will build and operate production inference systems, optimizing for performance and... ...fluent in Python, and have strong knowledge in GPU-accelerated inference. Excellent communication...
$180k - $250k
A tech innovation company is looking for a hands-on engineer in San Francisco to manage a vast fleet of GPU servers. You will build systems for tracking server lifecycle, automate provisioning and health checks, and ensure OS-level security. The role requires 5+ years of... [Fleet]
$209k - $253k
A leading AI infrastructure company in San Francisco seeks a Staff Software Engineer to design and develop control systems for GPU node management. The candidate will be critical in building foundational cloud infrastructure and achieving business goals. This role requires... [Fleet]
$250k
...for large-scale AI training and inference workloads. With expanding GPU infrastructure across Europe... ...infrastructure limitations. As a Senior ML Infrastructure Engineer, the successful candidate will... .... The role focuses on workload orchestration, GPU scheduling, inference...
$220k - $320k
...love taking cutting-edge ML techniques and turning... ...to meet you. About Inference.net Inference.net... ...funded ten-person team of engineers who work in-person in... ...large compute budget / GPU reservation, and... ...training across our GPU fleet Deeply understand customer... [Fleet, Work at office]
- ...based in San Francisco, is seeking a Distributed Training and Inference Engineer to enhance its machine learning infrastructure. This role involves... .... The ideal candidate will have over 5 years of experience in ML systems and be proficient in Python and C++. Sciforium offers... [Flexible hours]
- A cutting-edge technology company in San Francisco is seeking an ML Infrastructure Engineer to build and scale machine learning systems for real-time perception and inference. This role involves designing scalable training pipelines for computer vision models, optimizing...
- A media technology company in San Francisco is seeking a Founding Engineer specializing in ML Inference. This highly technical role requires expertise in the ML infrastructure stack and aims to optimize generative media performance. The ideal candidate will drive innovations... [Relocation package]
$200k - $350k
Inception in San Francisco is seeking engineers and scientists to design and optimize the compute... ...role includes developing high-performance ML kernels for significant operations and... ...precision arithmetic. A strong background in GPU programming and systems is necessary, as...
- ...scale. You're experienced with modern inference systems like TGI, vLLM, TensorRT-LLM,... ...contributions and staying current with ML infrastructure developments Bring practical... ...this usually requires a large engineering effort dedicated to building specialized... [Work at office]
- ...innovative team. You will own the optimizations for both training and on-robot inference stacks, focusing on achieving step-function gains. The ideal candidate should be proficient with the latest ML techniques and passionate about advancing robotics. Our mission is to make...
$200k
...deploying high‑throughput, ultra‑low‑latency inference engines for large language models or... ...conversational AI. Possess a deep understanding of GPU architectures (NVIDIA Ampere/Hopper) and... ...critical intersection between the core ML training team and the backend... [Full time, Work at office]

