Senior GPU ML Infra Engineer — Mid-Training & Inference
Reflection AI
A cutting-edge AI technology company based in San Francisco is seeking a specialist to design and operate large-scale GPU infrastructure. This role requires expertise in deploying GPU systems for high-throughput inference and model performance optimization. The ideal candidate will have hands-on experience with modern inference frameworks and a solid understanding of reinforcement learning technologies. Comprehensive healthcare benefits, parental leave, and daily meals are provided, along with competitive salary and equity packages.
- Reducto, a fast-growing AI company in San Francisco, is hiring a Machine Learning Infra Engineer. This role involves building and maintaining the training and inference frameworks necessary for optimal performance. Ideal candidates should possess strong Python skills,... (Training)
- ...San Francisco is looking for a Senior Software Engineer to build scalable infrastructure for large‑scale training and fine-tuning of foundation... ...training systems and optimize GPU utilization while collaborating... ...over 5 years of experience in ML infrastructure and a strong... (Senior, Training)
- ...the physical world. Training our models... ...heterogeneous fleet of GPU and TPU clusters —... ...seamless. The Team The ML Infrastructure... ...closely with ML Infra (training systems)... ...accelerators. Support Inference and Robot... ...Strong software engineering fundamentals Experience... (Training)
$250k
Hamilton Barnes Associates Limited in San Francisco is seeking an experienced engineer to design and maintain large-scale GPU clusters for training and inference. The candidate should have over 7 years in SRE or DevOps, with strong skills in Kubernetes and Linux systems... (Senior, Training)
- ...Member of Technical Staff focused on building and optimizing ML inference systems in San Francisco. The role involves designing end-to-... ...real-world workloads. Candidates should have strong software engineering skills, experience with ML inference systems, and proficiency... (Suggested)
- A leading AI infrastructure company is seeking a Senior ML Performance Engineer to design a comprehensive performance testing platform... ...performance engineering and strong experience with GPU programming and ML inference workloads. Candidates should have expertise in... (Senior)
$295k - $380k
...OpenAI is searching for a Senior Software Engineer to join their Robotics team in San Francisco. The role focuses on maintaining and improving the training framework while actively reviewing and debugging code within ML systems. The ideal candidate should thrive in hands... (Senior, Training)
- ...Francisco is seeking an experienced Software Engineer to develop machine learning... ...involves building data pipelines, creating training platforms, and collaborating with various... ...particularly in distributed systems and ML workflows. Join us in shaping the future... (Senior, Training)
- ...PDFs and spreadsheets. We train vision models to read... ...a Machine Learning Engineer to help us train and deploy... ...The Opportunity As an ML Infra Engineer, you’ll play... ...key role in building the inference and training frameworks... ...multi-node, multi-GPU environments with strong... (Training, Work at office, Local area)
- MakerMaker.AI is looking for a Senior Machine Learning Systems Engineer in San Francisco. In this role, you will build and operate production inference systems, optimizing for performance and reliability... ..., and have strong knowledge in GPU-accelerated inference. Excellent... (Senior)
$96.8k - $306.4k
...Job Description The Senior Principal AI Agent / ML Software Engineer is a Senior Staff-level,... ...workflows, scalable inference infrastructure, and enterprise... ..., high throughput, GPU efficiency, reliability,... ...-scale GPU inference or training workloads for latency, throughput... (Senior, Training, Temporary work, Flexible hours)
- ...startup building production‑grade ML infrastructure used by... ...customers. They are looking for a Senior AI/ML Engineer to own model training pipelines, evaluation systems, and inference serving at scale. Full‑time,... ...with distributed training, GPU optimization, or inference serving... (Senior, Training, Full time)
$200k - $350k
...company in San Francisco seeks candidates for a role specializing in robotic control systems. You will train whole-body policies, build simulation environments, and run GPU training experiments. Ideal candidates should have strong coding skills in Python, C++, or Rust, and... (Senior, Training)
$200k - $260k
...Senior Machine Learning Engineer, Voice AI San Francisco About the Role... ...is building the best inference infrastructure for... ...looking for a Senior ML Engineer to drive the... ...frontier. You'll profile GPU utilization, design... ...plus. ~ Experience training or fine-tuning speech... (Senior, Training, Full time)
- Comfy is seeking a skilled engineer to optimize model inference as part of the core ComfyUI team. This role focuses on enhancing AI model performance, memory management, and collaborating on innovative features. Ideal candidates have a strong background in PyTorch and... (Senior)
$204k - $259k
...generative modeling, Bayesian inference, hierarchical... ...you will report to a Senior Staff Software Engineer. You will:... ...life-cycle from pre-training and supervised fine-tuning... ...experience Experience in ML engineering and... ...We prefer: ML infra experience: training,... (Senior, Training, Full time, Temporary work, Remote work)
- Define the ML strategy, raise the technical bar... ...between research and engineering reality. You will have... ...platform: feature store, training infrastructure, model... ...stack. Mentor senior and mid-level engineers, conduct... ...retrieval augmentation, and inference optimization. Expert‑... (Senior, Training)
- ...is seeking a skilled professional to build scalable infrastructure for AI model training and inference. You will lead architectural decisions and work with core systems that power their GPU optimization platform. Candidates should have expertise in GPU fundamentals, deep... (Training)
$100k - $200k
Voiceflow is seeking a skilled ML-Infrastructure Engineer in San Francisco to architect and operate auto-scaling systems for our voice AI simulation platform. The role includes optimizing GPU and compute infrastructure, ensuring high performance and reliability. Ideal... (Work at office)
- ...Mach9 ML Engineer Role At Mach9, ML Engineers build the... ...allows us to develop and train cutting edge 3D scene... ...is ideal for early-to-mid-career ML engineers who... ...to scale training and inference of your models and with... ...Familiarity with multi-GPU training and experiment... (Training)
$200k
...deploying high-throughput, ultra-low-latency inference engines for large language models or... ...AI. Possess a deep understanding of GPU architectures (NVIDIA Ampere/Hopper) and... ...critical intersection between the core ML training team and the backend infrastructure team... (Training, Full time, Work at office, Worldwide)
- MakerMaker, based in San Francisco, is seeking a highly skilled kernel engineer to write and optimize GPU kernels that enhance performance for training and inference. This role involves deep, low-level work to close the significant performance gap that exists in modern... (Senior, Training)
- ...AI company in San Francisco is seeking a skilled ML Infrastructure Engineer to manage and optimize large-scale training systems. In this role, you will design and... ...infrastructure for model training, ensuring efficient GPU/TPU utilization while working closely with... (Training)
- A leading AI technology company in San Francisco is seeking an engineering professional to develop and manage intelligent job scheduling systems... ...role focuses on ensuring efficient resource allocation across GPU and TPU clusters while enhancing overall system reliability.... (Training)
- ML Systems Engineer - Robotics & AI We are building the full-stack foundation for the next generation... ...and handling scenarios unseen in training. We work at the intersection of large-scale... ...bottleneck identification at different GPU counts. Drive measurable gains in... (Training)
- ...don't believe culture can be engineered - but when it falls into place... ...Overview We're looking for an ML infrastructure engineer to help... ...supports every stage of the ML training flywheel and be an important... ...distributed ML training on our GPU clusters Take ownership of performance... (Training, Local area)
- About the Role ML Ops Engineer — Agentic AI Lab (Founding Team... ...automating the model training, deployment,... ...compute orchestration, GPU infrastructure, fine-tuned... ...conversion, quantization, and inference rollout Manage hybrid... ...engineering, or infra-focused ML roles Deep... (Training, Full time)
- ...seeking a Member of Technical Staff to design and optimize inference systems. The role involves managing KV cache... ...components. Ideal candidates should have strong software engineering skills and experience with ML inference systems, particularly in Python and C++. This... (Senior)
- ...we offer an innovative GPU marketplace and AI inference service that promise affordability... ...We're seeking a Senior Infrastructure Engineer to help build and scale... ...data infrastructure for AI/ML workloads, including... ...distributed file systems for training data and checkpoints... (Senior, Training, Remote work)
$220k
Perplexity is looking for an engineer to join their team in San Francisco.... ...work on building and operating the inference engine, supporting new models, migrating GPU kernels, and developing a Rust-... ...software engineering with a focus on ML inference, familiarity with deep... (Senior)

