Senior Machine Learning Engineer, On-Device & Mobile AI Optimization

$188.2k - $282.2k

Unity Technologies

San Francisco, CA, USA

Department

AI & Machine Learning

Requisition ID

JOBREQ-2616041

Role description

The opportunity

We are building the next generation of AI-driven game experiences, running generative models on-device, right where the players are - on phones, tablets, laptops, and desktops. Our games run inside a modern, browser-native runtime (built on technologies such as WebGPU and WebNN), so the models that power these experiences must be deployed and accelerated entirely within that runtime. As a Senior Machine Learning Engineer for On-Device & Mobile AI, you will take state-of-the-art multi-modal models - transformers, diffusion networks, and vision-language models (VLMs) - and make them run fast, small, and reliably on mobile and constrained hardware.

This is a deeply hands-on role. You will own the optimization and deployment of significant parts of the inference stack - from a trained checkpoint leaving research, through export, quantization, and kernel-level tuning, to a shipped feature running inside the engine at interactive frame rates within a fixed memory and power budget. Your work directly shapes the latency, quality, memory footprint, and battery profile of AI features experienced by billions of players.

This role is for an engineer who is energized by the gap between a research model and a shipping, on-device product. If you enjoy profilers, frame captures, op-fusion, and shaving milliseconds and megabytes, this is your role.

What you'll be doing

  • Inference & On-Device Optimization

  • Own the optimization pipeline for the models you ship: model export, graph transformation, operator fusion, memory-layout planning, and hardware-specific tuning across NPU, mobile GPU, and desktop/laptop GPU.

  • Apply quantization (INT4/INT8/FP16), weight sharing, structured/unstructured pruning, and knowledge distillation to hit hard latency, memory, and power budgets - and validate them against quality bars.

  • Do low-level performance work: write and tune WebGPU compute shaders (WGSL) and, where relevant, native kernels (Metal, Vulkan/SPIR-V compute, CUDA); profile with browser and platform tools (Chrome/Dawn GPU traces, PIX, Instruments/Metal System Trace, Snapdragon Profiler, Nsight, RenderDoc); and eliminate bottlenecks at the op and memory-bandwidth level.

  • Apply efficiency techniques - dynamic resolution, token reduction, cross-frame caching/reuse, reduced-step diffusion samplers - as engineering levers to meet budgets on target SKUs.

  • Runtime & Systems Integration

  • Work with WebGPU-targeted inference runtimes (ONNX Runtime Web, Transformers.js, WebLLM, TensorFlow.js) alongside native options (CoreML, ONNX Runtime, TFLite, ExecuTorch), and extend or build glue code where off-the-shelf options fall short of our diffusion and VLM workloads.

  • Build parts of the integration between the ML runtime and the game engine: real-time scheduling, memory pooling, zero-copy buffer sharing between the inference and render paths, and frame-budget management alongside the renderer.

  • Build supporting engineering for your components: model packaging and asset pipelines, on-device fallbacks and SKU-aware capability tiers, crash/quality telemetry, and automated on-device benchmarking in CI.

  • Research Productionization

  • Partner with research scientists to turn novel CV and multi-modal architectures into implementations that are deployable, debuggable, and fast on device.

  • Provide a feedback loop into research: surface hardware constraints, op-support gaps, and cost models early so model design and deployment converge.

  • Track breakthroughs in efficient inference (efficient attention, distillation, reduced-step diffusion) and assess them pragmatically: what actually moves latency/memory/power on our target devices.

  • Collaboration & Engineering Quality

  • Contribute to engineering best practices, code-review standards, performance-regression gates, and on-device benchmarking methodology.

  • Support a culture of measurement: track KPIs for latency, quality, memory, and power for the systems you work on, across the device matrix.

  • Partner with platform engineers, product managers, and runtime teams to align your work with device-SKU constraints and product roadmaps.

  • Share knowledge and mentor junior and mid-level engineers through code review, pairing, and design discussion.

What we're looking for

  • 5+ years in software/ML engineering, with meaningful time focused on on-device / edge inference or real-time, performance-critical systems.

  • Production deployment of transformer- and/or diffusion-based models (e.g., ViT, Stable Diffusion, CLIP/SigLIP-style encoders) on mobile, desktop, or embedded hardware - shipped, not just prototyped.

  • Hands-on experience with at least one major inference runtime (ONNX Runtime / ORT Web, CoreML, TFLite, ExecuTorch) and a working understanding of operator fusion, memory layout, and runtime scheduling.

  • Low-level performance engineering: solid command of at least one GPU/compute API - WebGPU/WGSL, Metal, Vulkan, D3D12, or CUDA - and the profiling tools to go with it. You can read a frame capture and a kernel trace and reason about where the time and memory go.

  • Working knowledge of model-optimization techniques - quantization (INT4/INT8/FP16), weight sharing, pruning, and distillation - and the judgment to apply them to hit latency and memory budgets. You use them effectively as engineering tools.

  • Understanding of target hardware: mobile SoCs (Apple Neural Engine, Qualcomm Hexagon/Adreno, ARM Mali) and/or desktop/laptop GPUs (Apple Silicon, NVIDIA, AMD, Intel).

  • Strong Python for export pipelines and training-side tooling; familiarity with the core languages of a browser-native runtime (TypeScript/JavaScript, WGSL) is a plus.

  • Working fluency with the models you deploy - enough to read an architecture, modify it for deployment, and reason about accuracy trade-offs.

  • A collaborative working style: clear communication, reliable delivery, and a willingness to support and learn from teammates.

You might also have

  • Experience shipping world-model, neural-rendering, or real-time generative pipelines (NeRF, 3DGS, real-time diffusion, or similar) on device.

  • Hands-on experience deploying models through WebGPU - e.g., ONNX Runtime Web (WebGPU EP), Transformers.js, WebLLM, or TensorFlow.js - including writing/tuning WGSL compute shaders.

  • Game-engine or real-time-graphics background (Unity, Unreal, or a custom engine; Metal/Vulkan/D3D/OpenGL ES render pipelines) - especially integrating compute workloads alongside a renderer.

  • Contributions to open-source ML inference frameworks, runtimes, or GPU/compute libraries, especially in the WebGPU ecosystem (Dawn, wgpu, ORT Web, Transformers.js, WebLLM).

  • Familiarity with compiler stacks (MLIR, TVM, IREE, XLA) for custom kernel generation and graph optimization.

  • Experience with on-device benchmarking infrastructure, performance-regression CI, and device-farm matrices.

  • Proficiency in C++/Objective-C/Swift for runtime integration.

Additional information

  • Relocation support is not available for this position

  • Work visa/immigration sponsorship is not available for this position

Benefits

At Unity, we want our team members to thrive. We offer a wide range of benefits designed to support well-being and work-life balance.

Please note: Benefits eligibility, specific offerings, and coverage vary based on the country and employment status.

While specific benefits vary, here are some of the ways we strive to take care of our eligible team members globally: Comprehensive health, life, and disability insurance | Commute subsidy | Employee stock ownership | Competitive retirement/pension plans | Generous vacation and personal days | Support for new parents through leave and family-care programs | Office food snacks | Mental Health and Wellbeing programs and support | Employee Resource Groups | Global Employee Assistance Program | Training and development programs | Volunteering and donation matching program

Life at Unity

Unity [NYSE: U] is the world's leading game engine, powering play for more than 3 billion consumers each month. The top mobile games in the world, the most played PC indie titles, the most innovative console games, and virtually all of the top XR and Web Games are developed, deployed, and grown in Unity. Unity also enables teams across industries like automotive, manufacturing, and healthcare to design, simulate, and collaborate in 3D - closing the gap between ideas and reality.

Unity is a proud equal opportunity employer. We are committed to fostering an inclusive, innovative environment and celebrate our employees across age, race, color, ancestry, national origin, religion, disability, sex, gender identity or expression, sexual orientation, or any other protected status in accordance with applicable law. Our differences are strengths that enable us to support the growing and evolving needs of our customers, partners, and collaborators. If you have a disability that means there are preparations or accommodations we can make to help ensure you have a comfortable and positive interview experience, please fill out this form to let us know.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

This position requires the incumbent to have a sufficient knowledge of English to have professional verbal and written exchanges in this language since the performance of the duties related to this position requires frequent and regular communication with colleagues and partners located worldwide and whose common language is English.

Headhunters and recruitment agencies may not submit resumes/CVs through this Web site or directly to managers. Unity does not accept unsolicited headhunter and agency resumes. Unity will not pay fees to any third-party agency or company that does not have a signed agreement with Unity.

Your privacy is important to us. Please take a moment to review our Prospect Privacy Policy and Applicant Privacy Policy. Should you have any concerns about your privacy, please contact us.

*Note: This range reflects the anticipated base salary for this position. Beyond base salary, this role may be eligible for equity awards and participation in our company incentive plans (such as annual discretionary bonuses or sales commissions). The final offer amount will depend on several factors, including geographic location and the candidate's relevant experience, professional background, and skill set.

Gross pay salary

$188,200-$282,200 USD

Vacancy posted 3 days ago