Senior AI Inference Engineer - Model Optimization & Deployment

Zoox Inc.

Job Description

The Perception team is pioneering the development of a multi-modality foundation model to drive the next generation of autonomous system intelligence.


As a Model Optimization & Deployment Engineer, you will focus on bringing highly efficient, production-ready large-scale models to our on-vehicle stack. We are looking for experts with hands-on experience in compressing, accelerating, and deploying complex models (LLMs, VLMs, or FMs) for power- and thermal-constrained vehicle SoCs. You will optimize ML models, write custom CUDA kernels, and build highly concurrent inference code to ensure real-time, deterministic execution on edge devices.

In this role, you will:
  • Optimize large-scale models (Multi-Modal Sensor Fusion models, LLMs, VLMs) using advanced quantization (PTQ, QAT), pruning, mixed-precision inference frameworks, and parameter-efficient fine-tuning (LoRA, QLoRA).
  • Architect and implement model conversion and compilation pipelines using TensorRT for edge deployment.
  • Perform rigorous parity checking, accuracy recovery, and latency benchmarking between PyTorch frameworks and compiled edge binaries.
  • Develop and optimize custom ML OPs and TensorRT Plugins with efficient CUDA kernels to minimize latency and maximize memory bandwidth on AI accelerators.
  • Write production-level, low-latency, and memory-safe C++ and CUDA code for real-time inference on vehicle systems.
Qualifications:
  • Deep expertise in model quantization (PTQ, QAT) and mixed-precision inference frameworks (INT8, FP8, FP4, BF16/FP16).
  • Proven experience optimizing large-scale models (Multi-Modal Sensor Fusion models, LLMs, VLMs/VLAs) utilizing Efficient Attention mechanisms (e.g., FlashAttention, Linear Attention), KV-cache optimization (e.g., PagedAttention) and Speculative Decoding.
  • Extensive experience with model conversion/compilation pipelines (e.g., ONNX, TensorRT, torch.compile) and with performing rigorous latency benchmarking and model-quality parity evaluation.
  • Proficiency in low-level programming for AI accelerators, specifically developing and optimizing custom ML OPs and TensorRT Plugins with efficient CUDA kernel implementations.
  • Production-level C++ (14/17/20) and Python programming skills, with experience developing concurrent, memory-safe, real-time inference code for edge devices.
Bonus Qualifications:
  • Familiarity with SOTA autonomous driving perception algorithms (temporal 3D object detection, BEV, 3D Occupancy Networks) and multi-modal sensor processing (Vision, LiDAR, Radar).
  • Experience with distributed training pipelines and model/tensor parallelism (PyTorch Distributed, Ray, DeepSpeed, Megatron-LM) and runtime efficiency optimization for GPU clusters.
  • Experience with end-to-end autonomous driving paradigms (VLM/VLA models, Foundation models) and edge deployment technologies (e.g., TensorRT-LLM).

Base Salary Range

 

There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. A sign-on bonus may be offered as part of the compensation package. The listed range applies only to the base salary. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position.

 

Zoox also offers a comprehensive package of benefits, including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.

About Zoox

Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We’re looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.

Accommodations

If you need an accommodation to participate in the application or interview process, please reach out to your assigned recruiter or to the contact email address listed on ziprecruiter.com.

A Final Note:

You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Vacancy posted 26 days ago