Engineering Manager - ML Platform and Infrastructure

$204k - $343k

Decisive Point

About Applied Intuition

Applied Intuition, Inc. is powering the future of physical AI. Founded in 2017 and now valued at $15 billion, the Silicon Valley company is creating the digital infrastructure needed to bring intelligence to every moving machine on the planet. Applied Intuition services the automotive, defense, trucking, construction, mining and agriculture industries in three core areas: tools and infrastructure, operating systems, and autonomy. Eighteen of the top 20 global automakers, as well as the United States military and its allies, trust the company’s solutions to deliver physical intelligence. Applied Intuition is headquartered in Sunnyvale, California, with offices in Washington, D.C.; San Diego; Ft. Walton Beach, Florida; Ann Arbor, Michigan; London; Stuttgart; Munich; Stockholm; Bangalore; Seoul; and Tokyo. Learn more at applied.co.

We are an in-office company, and our expectation is that employees primarily work from their Applied Intuition office 5 days a week. However, we also recognize the importance of flexibility and trust our employees to manage their schedules responsibly. This may include occasional remote work, starting the day with morning meetings from home before heading to the office, or leaving earlier when needed to accommodate family commitments.

About the role

As an Engineering Manager on the ML Platform team, you'll lead a world-class group of engineers focused on building the infrastructure that powers Physical AI at scale. Your team will own three critical areas: Training & Inference Orchestration, where we build frameworks to efficiently schedule and run massive jobs across thousands of GPUs; GPU Cluster Architecture, where we design and scale what will be the largest GPU cluster for Physical AI in the industry; and Performance Optimization, where we push the limits of hardware utilization, throughput, and cost efficiency for large-scale training and inference workloads.

You'll work at the intersection of systems engineering and ML, partnering directly with stack development and research teams to remove bottlenecks and accelerate the path from experimentation to production.

At Applied Intuition, you will:
  • Grow and manage a team of world-class infrastructure and systems engineers with the goal of delivering a best-in-class ML platform for Physical AI
  • Own the design and evolution of frameworks for orchestrating distributed training and inference jobs across thousands of GPUs
  • Drive the buildout and scaling of our GPU cluster infrastructure, making critical decisions on architecture, scheduling, networking, and resource management
  • Lead efforts to optimize training and inference performance, including throughput, fault tolerance, GPU utilization, and cost efficiency at scale
  • Set team goals and roadmap in alignment with research milestones, model development timelines, and production deployment requirements
  • Partner closely with research, stack development, and infrastructure teams to understand their workflows and accelerate their iteration speed
  • Drive hiring, mentoring, and growth for a high-performing, mission-driven team

We’re looking for someone who has:
  • 3+ years of engineering management experience, ideally leading infrastructure or platform teams
  • Passion for building and leading high-performing teams that operate at the frontier of scale
  • Deep experience with distributed systems, GPU computing, or large-scale ML infrastructure
  • Direct experience building or operating large GPU clusters (1,000+ GPUs)
  • Strong understanding of distributed training frameworks (e.g., PyTorch Distributed, Megatron-LM, DeepSpeed, FSDP) and job orchestration at scale
  • Familiarity with GPU cluster management, high-performance networking (InfiniBand, RDMA), and resource scheduling (Slurm, Kubernetes)
  • Track record of building and operating systems that run reliably at massive scale

Nice to have:
  • Background in training optimization techniques such as mixed-precision training, pipeline/tensor/data parallelism, or checkpointing strategies
  • Experience with inference optimization (batching, model serving, quantization, compiler-level optimizations)
  • Familiarity with Physical AI domains such as autonomous driving, robotics, or simulation
  • Contributions to open-source ML infrastructure projects

Compensation at Applied Intuition for eligible roles includes base salary, equity, and benefits. Base salary is a single component of the total compensation package, which may also include equity in the form of options and/or restricted stock units, comprehensive health, dental, vision, life and disability insurance coverage, 401k retirement benefits with employer match, learning and wellness stipends, and paid time off. Note that benefits are subject to change and may vary based on jurisdiction of employment.

Applied Intuition pay ranges reflect the minimum and maximum intended target base salary for new hire salaries for the position. The actual base salary offered to a successful candidate will additionally be influenced by a variety of factors including experience, credentials & certifications, educational attainment, skill level requirements, interview performance, and the level and scope of the position. Please reference the job posting’s subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the location listed is: $204,000 - $343,000 USD annually.

Applied Intuition is an equal opportunity employer and federal contractor or subcontractor. Consequently, the parties agree that, as applicable, they will abide by the requirements of 41 CFR 60-1.4(a), 41 CFR 60-300.5(a) and 41 CFR 60-741.5(a) and that these laws are incorporated herein by reference. These regulations prohibit discrimination against qualified individuals based on their status as protected veterans or individuals with disabilities, and prohibit discrimination against all individuals based on their race, color, religion, sex, sexual orientation, gender identity or national origin. These regulations require that covered prime contractors and subcontractors take affirmative action to employ and advance in employment individuals without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status or disability. The parties also agree that, as applicable, they will abide by the requirements of Executive Order 13496 (29 CFR Part 471, Appendix A to Subpart A), relating to the notice of employee rights under federal labor laws.

Vacancy posted 3 days ago
