Senior ML Platform Engineer
$152k - $241.5k
NVIDIA
# Senior ML Platform Engineer

* Locations: US, CA, Santa Clara; US, MA, Westford; US, CO, Remote; US, NC, Durham; US, CO, Boulder
* Time type: Full time
* Posted on: Posted 2 Days Ago
* Job requisition ID: JR2013616

NVIDIA is at the forefront of innovations in Artificial Intelligence, High-Performance Computing, and Visualization. Our invention, the GPU, functions as the visual cortex of modern computing and is central to groundbreaking applications from generative AI to autonomous vehicles. We are now looking for an ML Platform Engineer to help accelerate the next era of machine learning innovation.

In this role, you will architect, build, and scale our high-performance ML infrastructure using modern Infrastructure-as-Code practices. Your primary focus will be on creating reliable, automated platforms that empower scientists and engineers to train and deploy the most advanced ML models on some of the world’s most powerful GPU systems. Join our top team and apply your SRE and software engineering skills to craft robust, user-friendly platforms for seamless ML development.

**What You'll Be Doing:**

* Design, build, and maintain our core ML platform infrastructure as code, primarily using Ansible and Terraform, ensuring reproducibility and scalability across large-scale, distributed GPU clusters.
* Apply SRE principles to diagnose, troubleshoot, and resolve complex system issues across the entire stack, ensuring high availability and performance for critical AI workloads.
* Develop robust internal automation and tooling for ML workflow orchestration, resource scheduling, and platform operations, with a strong focus on software engineering best practices.
* Collaborate with ML researchers and applied scientists to understand infrastructure needs and build solutions that streamline their end-to-end experimentation.
* Evolve and operate our multi-cloud and hybrid (on-prem + cloud) environments, implementing monitoring, alerting, and incident response protocols.
* Participate in the on-call rotation to provide support for platform services and infrastructure running critical ML jobs, driving root cause analysis and implementing preventative measures.
* Write high-quality, maintainable code (Python, Go) to contribute to the core orchestration platform and automate manual processes.
* Drive the adoption of modern GPU technologies and ensure smooth integration of next-generation hardware into ML pipelines (e.g., GB200, NVLink).

**What We Need To See:**

* BS/MS in Computer Science, Engineering, or equivalent experience.
* 5+ years in software/platform engineering or SRE roles, including 3+ years focused on ML infrastructure or distributed compute systems.
* Strong proficiency in Infrastructure-as-Code (IaC) tools, specifically Ansible and Terraform, with a proven track record of building and managing production infrastructure.
* SRE-oriented mindset with extensive experience in diagnosing system-level issues, performance tuning, and ensuring platform reliability.
* Solid understanding of ML workflows and lifecycle, from data preprocessing to deployment.
* Proficiency in operating containerized workloads with Kubernetes and Docker.
* Strong software engineering skills in languages such as Python or Go, with a focus on automation, tooling, and writing production-grade code.
* Experience with Linux systems internals, networking, and performance tuning at scale.

**Ways To Stand Out From The Crowd:**

* Experience building or operating ML platforms supporting frameworks like PyTorch or TensorFlow at scale.
* Deep understanding of distributed training techniques (e.g., data/model parallelism, Horovod, NCCL).
* Expertise with modern CI/CD methodologies and GitOps practices.
* Passion for building developer-centric platforms with great UX and strong operational reliability.
* Proven ability to contribute code to complex orchestration or automation platforms.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 9, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
NVIDIA Corporation
Vacancy posted 16 hours ago


