Senior GPU ML Infra Engineer — Mid-Training & Inference

Reflection AI

A cutting-edge AI technology company based in San Francisco is seeking a specialist to design and operate large-scale GPU infrastructure. This role requires expertise in deploying GPU systems for high-throughput inference and model performance optimization. The ideal candidate will have hands-on experience with modern inference frameworks and a solid understanding of reinforcement learning technologies. Comprehensive healthcare benefits, parental leave, and daily meals are provided, along with competitive salary and equity packages.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Senior GPU ML Infra Engineer — Mid-Training & Inference in San Francisco, CA vacancy
  • Reducto, Inc. is hiring a Machine Learning Infra Engineer in San Francisco to build and maintain ML training and inference frameworks. The role focuses on high performance and scaling across multiple nodes and GPUs. The ideal candidate will have strong Python skills and... 
    Training

    Reducto, Inc.

    San Francisco, CA
    3 days ago
  • Reducto, a fast-growing AI company in San Francisco, is hiring a Machine Learning Infra Engineer. This role involves building and maintaining the training and inference frameworks necessary for optimal performance. Ideal candidates should possess strong Python skills,... 
    Training

    Reducto

    San Francisco, CA
    3 days ago
  •  ...San Francisco is looking for a Senior Software Engineer to build scalable infrastructure for large‑scale training and fine-tuning of foundation...  ...training systems and optimize GPU utilization while collaborating...  ...over 5 years of experience in ML infrastructure and a strong... 
    Senior
    Training

    Baseten

    San Francisco, CA
    4 days ago
  •  ...ML Infrastructure Engineer In this role you will help scale and optimize our training systems and core model code. You'll own...  ...training, from managing GPU/TPU compute and job...  ...Own training/inference infrastructure: Design...  ...research needs into infra capabilities and guide... 
    Training

    Physical Intelligence

    San Francisco, CA
    3 days ago
  •  ...the physical world. Training our models...  ...heterogeneous fleet of GPU and TPU clusters —...  ...The Team The ML Infrastructure team...  ...work closely with ML Infra (training systems)...  .... - Support Inference and Robot Deployment...  ...- Strong software engineering fundamentals - Experience... 
    Training
    Flexible hours

    Physical Intelligence

    San Francisco, CA
    3 days ago
  • $250k

    Hamilton Barnes Associates Limited in San Francisco is seeking an experienced engineer to design and maintain large-scale GPU clusters for training and inference. The candidate should have over 7 years in SRE or DevOps, with strong skills in Kubernetes and Linux systems... 
    Senior
    Training

    Hamilton Barnes Associates Limited

    San Francisco, CA
    17 hours ago
  •  ...Member of Technical Staff focused on building and optimizing ML inference systems in San Francisco. The role involves designing end-to-...  ...real-world workloads. Candidates should have strong software engineering skills, experience with ML inference systems, and proficiency... 

    Acceler8 Talent

    San Francisco, CA
    3 days ago
  • $128.7k - $261.3k

     ...export, kernel development, and performance engineering so that every cycle on our accelerators...  ...AI Kernels team builds high-performance GPU kernels and custom libraries that sit at the heart of our on-vehicle ML inference for ADAS and autonomous driving. We own...
    Senior
    Local area
    Work from home
    Relocation package
    Flexible hours

    General Motors

    San Francisco, CA
    1 day ago
  • $152k - $228k

     ...Job Description Senior ML Engineer About Invoca Invoca is an AI...  ...lifecycle at Invoca, from model training and fine-tuning through inference optimization and production APIs....  ...Server, Baseten, and Kubernetes-based GPU infrastructure. Profile and tune...
    Senior
    Training
    Currently hiring
    Remote work
    Flexible hours

    Invoca

    San Francisco, CA
    1 day ago
  • A leading AI infrastructure company is seeking a Senior ML Performance Engineer to design a comprehensive performance testing platform...  ...performance engineering and strong experience with GPU programming and ML inference workloads. Candidates should have expertise in... 
    Senior

    Amadeus Search

    San Francisco, CA
    1 day ago
  •  ...PDFs and spreadsheets. We train vision models to read...  ...a Machine Learning Engineer to help us train and deploy...  ...As an ML Infra Engineer , you'll play...  ...key role in building the inference and training frameworks...  ...across multi-node, multi-GPU environments with strong... 
    Training
    Work at office
    Local area

    Reducto

    San Francisco, CA
    17 hours ago
  •  ...in San Francisco is searching for an ML Infrastructure and Platform Engineer. In this role, you will lead the architecture and scaling of our GPU compute platform from the ground up,...  ...ensuring high availability and low-latency inference. This is a founding technical hire... 

    URun

    San Francisco, CA
    4 days ago
  • $141k - $249k

     ...Build standardized distributed training frameworks for research and...  ...adopted into Waabi’s training and inference frameworks. Examples include...  ...- Work with researchers and ML engineers on best-practices for optimal...  ...Skilled in profiling CPU and GPU code using tools such as... 
    Senior
    Training
    Work at office
    Work from home
    Flexible hours

    Waabi

    San Francisco, CA
    1 day ago
  • A leading technology company is looking for an ML Infrastructure Engineer in San Francisco. The successful candidate will build and maintain ML training pipelines and ensure low-latency model serving. Candidates should have over 4 years of experience in ML engineering,... 
    Training
    Work at office

    Lattice, Inc.

    San Francisco, CA
    3 days ago
  •  ...and games to robotics training, simulations, and digital...  ...exceptional research engineers and applied researchers...  ...Technical Staff - Data & ML Infrastructure Engineer...  ...'s model training and inference infrastructure. This...  ...You'll work across GPU kernels, inference systems... 
    Training

    Moonlake AI

    San Francisco, CA
    4 days ago
  • $151.8k - $265.35k

     ...creativity. We're seeking an outstanding ML infra engineer with deep expertise in building...  ..., scalable and reliable PyTorch training infrastructures, GPU optimizations with custom CUDA kernels...  ...profile GPU utilization, trace inference and training runs and help craft strategies... 
    Senior
    Training
    Full time
    Temporary work
    Local area
    Worldwide

    Adobe

    San Francisco, CA
    1 day ago
  • $300k - $430k

     ...team. About the Team The ML Infrastructure team...  ...the platforms for model training, the infrastructure for...  ...routing layer that manages inference across multiple...  ...Staff ML Infrastructure Engineer to own the platforms powering...  ...training: multi-node GPU clusters, fault tolerance... 
    Training
    Work at office

    Decagon

    San Francisco, CA
    2 days ago
  • $200k - $260k

     ...Senior Machine Learning Engineer, Voice AI San Francisco About the Role...  ...is building the best inference infrastructure for...  ...looking for a Senior ML Engineer to drive the...  ...frontier. You'll profile GPU utilization, design...  ...plus. ~ Experience training or fine-tuning speech... 
    Senior
    Training
    Full time

    Together AI

    San Francisco, CA
    17 hours ago
  •  ...is seeking a skilled professional to build scalable infrastructure for AI model training and inference. You will lead architectural decisions and work with core systems that power their GPU optimization platform. Candidates should have expertise in GPU fundamentals, deep... 
    Training

    Wafer

    San Francisco, CA
    2 days ago
  • $200k - $350k

     ...company in San Francisco seeks candidates for a role specializing in robotic control systems. You will train whole-body policies, build simulation environments, and run GPU training experiments. Ideal candidates should have strong coding skills in Python, C++, or Rust, and... 
    Senior
    Training

    Pantera Capital

    San Francisco, CA
    2 days ago
  • $204k - $259k

     ...Senior Machine Learning Engineer – VLM/LLM Evaluation Waymo is an autonomous...  ...modeling, Bayesian inference, hierarchical...  ...life-cycle from pre-training and supervised fine-...  ...experience Experience in ML engineering and...  ...prefer: ML infra experience: training... 
    Senior
    Training
    Full time
    Temporary work
    Remote work

    Waymo

    San Francisco, CA
    2 days ago
  • $161.93k - $227.33k

     ...Senior Machine Learning Engineer Brisbane, California About This Opportunity...  ...machine learning (AI/ML) systems in a cloud...  ...for current training paradigms. The Senior...  ...model management, and inference. Collaborate closely...  ...data. Experience GPU/Accelerator programming... 
    Senior
    Training
    Work at office
    Local area
    Remote work
    2 days per week
    3 days per week

    Freenome

    Brisbane, CA
    17 hours ago
  • $175k - $205k

     ...Role Description: We are seeking a Senior ML Scientist/Engineer to design models that operate on...  ...analysis, feature engineering, model training, evaluation, and optimization. Design...  ...on edge ML, resource-constrained inference, and efficient training techniques.... 
    Senior
    Training

    Gridware

    San Francisco, CA
    1 day ago
  • Define the ML strategy, raise the technical bar...  ...between research and engineering reality. You will have...  ...platform: feature store, training infrastructure, model...  ...stack. Mentor senior and mid-level engineers, conduct...  ...retrieval augmentation, and inference optimization. Expert‑... 
    Senior
    Training

    Sierracorp

    San Francisco, CA
    4 days ago
  • $131.4k - $235.95k

    Autodesk, Inc. is seeking a Senior Machine Learning Engineer for MLOps in San Francisco. You will ensure AI-powered experiences meet high standards...  ...include automating model testing, managing inference services, and integrating REST APIs. Required qualifications... 
    Senior

    Autodesk, Inc.

    San Francisco, CA
    17 hours ago
  • $200k

     ...deploying high-throughput, ultra-low-latency inference engines for large language models or...  ...AI. Possess a deep understanding of GPU architectures (NVIDIA Ampere/Hopper) and...  ...critical intersection between the core ML training team and the backend infrastructure team... 
    Training
    Full time
    Work at office
    Worldwide

    Plaud

    San Francisco, CA
    4 days ago
  •  ...ML Ops Engineer — Agentic AI Lab (Founding Team) Location...  ...automating the model training, deployment, versioning...  ...compute orchestration, GPU infrastructure, fine-...  ...conversion, quantization, and inference rollout Manage...  ...engineering, or infra-focused ML roles ~ Deep... 
    Training
    Full time

    Fabrion

    San Francisco, CA
    1 day ago
  • $100k - $200k

    Voiceflow is seeking a skilled ML-Infrastructure Engineer in San Francisco to architect and operate auto-scaling systems for our voice AI simulation platform. The role includes optimizing GPU and compute infrastructure, ensuring high performance and reliability. Ideal... 
    Work at office

    Voiceflow

    San Francisco, CA
    4 days ago
  • $249.6k - $312k

     ...Senior Staff Data Scientist - Bayesian Experimentation and Causal Inference New York, New York, United States; San...  ...us to build the truth engine behind better mental...  ...directional evidence, mid levels reflect increasingly...  ...framing). Build training, playbooks, and... 
    Senior
    Training
    Work from home
    Flexible hours

    Headway - Design & Development

    San Francisco, CA
    1 day ago
  •  ...Francisco, is seeking an AI Platform Engineer to manage and optimize the training and inference of AI models. You will lead efforts in...  ...and distributed training on advanced GPU clusters. The ideal candidate has a solid foundation in ML engineering, particularly with Ray,... 
    Training

    Medium

    San Francisco, CA
    2 days ago
