Senior AI Inference Engineer - Model Optimization & Deployment

$242k - $290k

Zoox Inc.

The Perception team is pioneering the development of a multi-modality foundation model to drive the next generation of autonomous system intelligence.

As a Model Optimization & Deployment Engineer, you will focus on bringing highly efficient, production-ready large-scale models to our on-vehicle stack. We are looking for experts with hands-on experience in compressing, accelerating, and deploying complex models (LLMs, VLMs, or FMs) for power- and thermal-constrained vehicle SoCs. You will optimize ML models, write custom CUDA kernels, and build highly concurrent inference code to ensure real-time, deterministic execution on edge devices.

In this role, you will:

Optimize large-scale models (Multi-Modal Sensor Fusion models, LLMs, VLMs) using advanced quantization (PTQ, QAT), pruning, mixed-precision inference frameworks, and parameter-efficient fine-tuning (LoRA, QLoRA).
Architect and implement model conversion and compilation pipelines using TensorRT for edge deployment.
Perform rigorous parity checking, accuracy recovery, and latency benchmarking between PyTorch reference models and compiled edge binaries.
Develop and optimize custom ML ops and TensorRT plugins with efficient CUDA kernels to minimize latency and maximize memory-bandwidth utilization on AI accelerators.
Write production-level, low-latency, memory-safe C++ and CUDA code for real-time inference on vehicle systems.

Qualifications:

Deep expertise in model quantization (PTQ, QAT) and mixed-precision inference (INT8, FP8, FP4, BF16/FP16).
Proven experience optimizing large-scale models (Multi-Modal Sensor Fusion models, LLMs, VLMs/VLAs) using efficient attention mechanisms (e.g., FlashAttention, Linear Attention), KV-cache optimization (e.g., PagedAttention), and speculative decoding.
Extensive experience with model conversion/compilation pipelines (e.g., ONNX, TensorRT, torch.compile) and with rigorous latency benchmarking and model-quality parity evaluation.
Proficiency in low-level programming for AI accelerators, specifically developing and optimizing custom ML ops and TensorRT plugins with efficient CUDA kernel implementations.
Production-level C++ (14/17/20) and Python programming skills, with experience developing concurrent, memory-safe, real-time inference code for edge devices.

Bonus Qualifications:

Familiarity with SOTA autonomous driving perception algorithms (temporal 3D object detection, BEV, 3D Occupancy Networks) and multi-modal sensor processing (Vision, LiDAR, Radar).
Experience with distributed training pipelines and model/tensor parallelism (PyTorch Distributed, Ray, DeepSpeed, Megatron-LM) and runtime efficiency optimization for GPU clusters.
Experience with end-to-end autonomous driving paradigms (VLM/VLA models, Foundation models) and edge deployment technologies (e.g., TensorRT-LLM).

Base Salary Range: $242,000 - $290,000 a year

There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. A sign-on bonus may be offered as part of the compensation package. The listed range applies only to the base salary. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position.

Zoox also offers a comprehensive package of benefits, including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.

About Zoox

Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We're looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.

Accommodations

If you need an accommodation to participate in the application or interview process, please reach out to [email protected] or your assigned recruiter.

A Final Note:

You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses and identifying potential inconsistencies or verification signals in application materials based on available information. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Vacancy posted 1 day ago
