ML Infra Engineer: Scale GPU Compute & Models
$100k - $200k
Voiceflow
Voiceflow is seeking a skilled ML Infrastructure Engineer in San Francisco to architect and operate auto-scaling systems for our voice AI simulation platform. The role includes optimizing GPU and compute infrastructure and ensuring high performance and reliability. Ideal candidates have hands-on experience with AWS and a passion for voice AI models. Join a dynamic team with backgrounds at companies like Waymo and Google, and work in a collaborative, dog-friendly office. Enjoy a competitive salary ranging from $100K to $200K along with equity options.
- ...ML Infrastructure Engineer In this role you will help scale and optimize our training systems and core model code. You'll own critical infrastructure for... ...training, from managing GPU/TPU compute and job orchestration... ...research needs into infra capabilities and guide...
- ...world. Training our models requires... ...heterogeneous fleet of GPU and TPU clusters... .... That doesn't scale. We need a scheduling and compute layer that makes... ...The Team The ML Infrastructure... ...closely with ML Infra (training systems... ...Strong software engineering fundamentals -... (Flexible hours)
- URun in San Francisco is searching for an ML Infrastructure and Platform Engineer. In this role, you will lead the architecture and scaling of our GPU compute platform from the ground up, ensuring high availability and low-latency inference. This is a founding technical...
- ...company in San Francisco is seeking a skilled ML Infrastructure Engineer to manage and optimize large-scale training systems. In this role, you will design and maintain infrastructure for model training, ensuring efficient GPU/TPU utilization while working closely with...
- Reducto, a fast-growing AI company in San Francisco, is hiring a Machine Learning Infra Engineer. This role involves building and maintaining the training and inference frameworks necessary for optimal performance. Ideal candidates should possess strong Python skills,...
- ...based in San Francisco is seeking a specialist to design and operate large-scale GPU infrastructure. This role requires expertise in deploying GPU systems for high-throughput inference and model performance optimization. The ideal candidate will have hands-on experience...
- ...looking for a Senior Software Engineer to build scalable infrastructure for large‑scale training and fine-tuning of foundation models. You will design... ...training systems and optimize GPU utilization while collaborating... ...5 years of experience in ML infrastructure and a...
- ...believe culture can be engineered - but when it falls... ...'re looking for an ML infrastructure... ...design, build, and scale the foundational systems... ...spanning vehicle compute to data collection... ...to large-scale model training and deployment... ...ML training on our GPU clusters Take... (Local area)
$250k - $380k
...time Department Scaling Compensation $25... ...powers frontier models at massive scale... ...execution across vast GPU/accelerator... ...looking for an engineer to design and implement... ..., and other infra groups to ensure... ...) part of the ML stack. Bonus... ...employment: protect computer hardware... (Full time, Work at office, Local area, Relocation package, Flexible hours)
- ...Machine Learning Infrastructure Engineer to design and scale critical infrastructure powering ML applications. This role... ...data pipelines and optimizing modeling processes, essential for developing... ...innovative solutions in brain-computer interface technology. The ideal...
- ...Staff ML Platform Engineer – Large Scale Training (LLMOps/MLOps) We're TrueFoundry... ...route between models. They integrate... ...guardrails A unified compute layer to run self-hosted... ..., optimizing multi-GPU training, and shipping... ...love solving gnarly infra challenges - this is your... (Flexible hours)
- Reducto, Inc. is hiring a Machine Learning Infra Engineer in San Francisco to build and maintain ML training and inference frameworks. The role focuses on high performance and scaling across multiple nodes and GPUs. The ideal candidate will have strong Python skills and...
- ML Systems Engineer - Robotics & AI We are building the full-stack... ...to the foundational models and video world models... ...intersection of large-scale learning, robotics, and... ...We are creating a new computing platform for physical... ...identification at different GPU counts. Drive...
- ...believe culture can be engineered - but when it falls... ...’re looking for an ML engineer to design,... ...(VLA) foundation model at the core of... ...decisions and large-scale training to closed-... ...Qualifications MS or PhD in Computer Science, Machine... ...training, and GPU-accelerated workflows... (Local area)
- ...spreadsheets. We train vision models to read those documents... ...automate processes at scale. We've grown... ...hiring a Machine Learning Engineer to help us train and... ...Opportunity As an ML Infra Engineer, you'll play... ...across multi-node, multi-GPU environments with... (Work at office, Local area)
- ...technology company in San Francisco is seeking an ML Infrastructure Engineer to build and scale machine learning systems for real-time perception... ...involves designing scalable training pipelines for computer vision models, optimizing them for edge devices, and collaborating...
$128.7k - $261.3k
...reliably on real vehicles at scale. We pioneer new approaches to model export, kernel development, and performance engineering so that every cycle on our... ...builds high-performance GPU kernels and custom libraries... ...the heart of our on-vehicle ML inference for ADAS and autonomous... (Local area, Work from home, Relocation package, Flexible hours)
- ...ML Ops Engineer - Agentic AI Lab (Founding Team) Location... ...-graph-grounded models. We're hiring an... ...You'll work across compute orchestration, GPU infrastructure, fine... ...platform engineering, or infra-focused ML roles ~... ...(spot instance scaling, batch prioritization... (Full time)
- ...Foundation Models For Biology You will play a pivotal... ..., implementing, and scaling foundational AI models... ...Implement high-performance ML algorithms optimised... ...foundation in distributed computing principles, parallel... ...Hardware optimisation (GPU/TPU/HPU) Finetuning...
$204k - $259k
...autolabels at a massive scale, serving as the... ...stack. We are an advanced ML and engineering team that leverages state-of-the-art computer vision, deep learning,... ...computer vision / multimodal models (e.g., Gemini) to... ...Collaborate closely with the ML Infra, Perception, Behavior,... (Full time, Remote work)
$129.3k
...Learning Systems Engineer to join Frontier... ...infrastructure for large-scale machine learning models, particularly in... ...Amazon's massive computational infrastructure... ..., scalable ML systems. - Evaluate... ...and optimize GPU memory and throughput... ...research, data infra teams to integrate... (Internship, Local area)
$181.1k - $318.4k
.../Sr. Machine Learning Engineer, Foundation Models - AI, Search & Knowledge... ...every ounce of compute from our hardware. As... ...technologies and make it run at scale of Apple.... ...~ Familiarity with GPU programming concepts using... ...with one of the popular ML Frameworks like PyTorch... (Relocation)
- ...Sesame believes in a future where computers are lifelike - with the ability... ...of LLM, speech, and vision models. Partner with ML infrastructure and training engineers to build a fast, cost-effective... ...ex. Bottleneck analysis in high-scale server systems or profiling low... (Full time, Contract work, Flexible hours)
- An innovative company is seeking a talented software engineer to join their dynamic Inference team. This role involves designing and implementing infrastructure for large-scale multimodal models, focusing on high-performance delivery of audio and image inputs. You'll collaborate...
- A leading technology company is looking for an ML Infrastructure Engineer in San Francisco. The successful candidate will build and maintain ML training pipelines and ensure low-latency model serving. Candidates should have over 4 years of experience in ML engineering,... (Work at office)
- A cutting-edge tech company in San Francisco seeks infrastructure engineers to enhance the tooling and systems that power its AI applications. Responsibilities include building GPU orchestration, scaling cloud batch job systems, and designing efficient scheduling software... (Visa sponsorship)
- ...in San Francisco is seeking a talented engineer to design and implement robust CI/CD pipelines... ...will have a bachelor's degree in Computer Science or a related field, with at least... ...ensuring comprehensive observability of models. Enjoy competitive benefits and a dynamic...
- ...frontier of interactive world models: systems that generate... ...exceptional research engineers and applied researchers... ...Staff - Data & ML Infrastructure Engineer... .... You'll work across GPU kernels, inference systems... ...observability, and large-scale orchestration systems....
- ...Technologies, Inc. is looking for an MLOps Engineer in San Francisco to design and implement cloud-based workflows for AI models. This role involves collaboration with cross... ...and demands a Bachelor's degree in Computer Science with relevant experience in programming... (Work at office)
- ...individual to take on a hands-on role focused on scaling and optimizing ML training systems. Key responsibilities... ..., improving performance, and managing GPU/TPU compute resources. Ideal candidates will have strong software engineering foundations, hands-on experience in JAX...

