Principal Machine Learning Engineer, Mobile AI Inference Optimization

$278.1k - $347.6k

Unity Technologies

Mountain View, CA, USA

Location

Mountain View, CA, USA

Department

AI & Machine Learning

Requisition ID

JOBREQ-2615941

Role description

The opportunity

We are building the next generation of mobile game AI experiences by deploying world models on mobile devices. As our Principal Machine Learning Engineer, you will be the foremost technical authority on bringing state-of-the-art multi-modal models (transformers, diffusion networks, and JAPE-style architectures) from research to production on mobile hardware.

This is a deeply hands-on, high-impact role. You will define the inference strategy, drive architectural decisions across the full mobile ML stack, and mentor a team of senior and mid-level engineers. Your work will directly determine the latency, quality, and power profile of AI-driven features experienced by billions of mobile game players.

What you'll be doing

  • Technical Leadership:

  • Set the technical vision and roadmap for deploying multi-modal AI models to iOS and Android, spanning transformers, diffusion models, and JAPE-style generative architectures.

  • Make authoritative decisions on model compression, quantization, pruning, and knowledge distillation strategies to meet mobile latency and memory budgets.

  • Evaluate and select inference runtimes (e.g., CoreML, ONNX Runtime Mobile, TFLite, ExecuTorch) and drive adoption across the team.

  • Own the end-to-end optimization pipeline: from model export and graph transformation to hardware-specific kernel tuning on NPU, GPU, and CPU.

  • Architecture & Research Translation:

  • Collaborate directly with research scientists to translate novel model architectures into deployable, mobile-optimized implementations.

  • Design scalable systems for multi-modal inference that process diverse inputs - images, text, primitives, and metadata - and produce pixel-level outputs with real-time performance.

  • Pioneer new approaches to dynamic resolution, token reduction, and speculative decoding tailored to mobile constraints.

  • Track and rapidly adopt breakthroughs in efficient diffusion (e.g., consistency models, flow matching) and efficient attention (e.g., FlashAttention, linear attention variants).

  • Team & Cross-Functional Leadership:

  • Lead and mentor a team of ML engineers; define engineering best practices, code review standards, and on-device benchmarking methodology.

  • Partner with platform engineers, product managers, and runtime teams to align ML capabilities with device SKU constraints and product roadmaps.

  • Champion a culture of measurement: define KPIs for latency, accuracy, memory, and power consumption and ensure the team tracks them rigorously.

What we're looking for

  • 8+ years in ML engineering, with at least 3 years focused on on-device / edge inference optimization.

  • Proven production deployment of transformer-based models (e.g., ViT, LLaMA, Stable Diffusion) and/or JAPE-style generative architectures on mobile or embedded hardware.

  • Hands-on expertise with CoreML, TFLite, ONNX Runtime, and/or ExecuTorch; deep understanding of operator fusion, memory layout, and runtime scheduling.

  • Expert-level command of INT8/INT4/FP16 quantization, weight sharing, structured/unstructured pruning, and knowledge distillation.

  • Strong understanding of mobile SoC architectures (Apple Neural Engine, Qualcomm Hexagon/Adreno, ARM Mali) and how to target each for peak throughput.

  • Proficiency in C++ / Objective-C / Swift for runtime integration; solid Python for training-side tooling and export pipelines.

  • Ability to read, implement, and extend ML research papers; familiarity with efficient attention, diffusion samplers, and multi-modal fusion techniques.

  • Track record of technical leadership: setting direction, influencing cross-functional partners, and growing engineers.

You might also have

  • Experience shipping world-model or neural rendering pipelines (NeRF, 3DGS, or similar) on mobile.

  • Contributions to open-source ML inference frameworks or mobile ML research publications.

  • Familiarity with compiler stacks such as MLIR, TVM, or XLA for custom kernel generation.

  • Background in real-time graphics or game engine pipelines (Metal, Vulkan, OpenGL ES).

Additional information

  • International relocation support is not available for this position

Benefits

At Unity, we want our team members to thrive. We offer a wide range of benefits designed to support well-being and work-life balance.

Please note: Benefits eligibility, specific offerings, and coverage vary based on the country and employment status.

While specific benefits vary, here are some of the ways we strive to take care of our eligible team members globally:

  • Comprehensive health, life, and disability insurance

  • Commute subsidy

  • Employee stock ownership

  • Competitive retirement/pension plans

  • Generous vacation and personal days

  • Support for new parents through leave and family-care programs

  • Office food snacks

  • Mental Health and Wellbeing programs and support

  • Employee Resource Groups

  • Global Employee Assistance Program

  • Training and development programs

  • Volunteering and donation matching program

Life at Unity

Unity [NYSE: U] is the world's leading game engine, powering play for more than 3 billion consumers each month. The top mobile games in the world, the most played PC indie titles, the most innovative console games, and virtually all of the top XR and Web Games are developed, deployed, and grown in Unity. Unity also enables teams across industries like automotive, manufacturing, and healthcare to design, simulate, and collaborate in 3D - closing the gap between ideas and reality.

Unity is a proud equal opportunity employer. We are committed to fostering an inclusive, innovative environment and celebrate our employees across age, race, color, ancestry, national origin, religion, disability, sex, gender identity or expression, sexual orientation, or any other protected status in accordance with applicable law. Our differences are strengths that enable us to support the growing and evolving needs of our customers, partners, and collaborators. If you have a disability that means there are preparations or accommodations we can make to help ensure you have a comfortable and positive interview experience, please fill out this form to let us know.

This position requires the incumbent to have a sufficient knowledge of English to have professional verbal and written exchanges in this language since the performance of the duties related to this position requires frequent and regular communication with colleagues and partners located worldwide and whose common language is English.

Headhunters and recruitment agencies may not submit resumes/CVs through this website or directly to managers. Unity does not accept unsolicited headhunter and agency resumes. Unity will not pay fees to any third-party agency or company that does not have a signed agreement with Unity.

Your privacy is important to us. Please take a moment to review our Prospect Privacy Policy and Applicant Privacy Policy. Should you have any concerns about your privacy, please contact us.

*Note: This range reflects the anticipated base salary for this position. Beyond base salary, this role may be eligible for equity awards and participation in our company incentive plans (such as annual discretionary bonuses or sales commissions). The final offer amount will depend on several factors, including geographic location and the candidate's relevant experience, professional background, and skill set.

Gross pay salary

$278,100-$347,600 USD

Vacancy posted 2 days ago