Senior AI Inference Engineer - Model Optimization & Deployment

$242k - $290k

Zoox Inc.

The Perception team is pioneering the development of a multi-modality foundation model to drive the next generation of autonomous system intelligence.

As a Model Optimization & Deployment Engineer, you will focus on bringing highly efficient, production-ready large-scale models to our on-vehicle stack. We are looking for experts with hands-on experience in compressing, accelerating, and deploying complex models (LLMs, VLMs, or FMs) for power- and thermal-constrained vehicle SoCs. You will optimize ML models, write custom CUDA kernels, and build highly concurrent inference code to ensure real-time, deterministic execution on edge devices.

In this role, you will:

  • Optimize large-scale models (Multi-Modal Sensor Fusion models, LLMs, VLMs) using advanced quantization (PTQ, QAT), pruning, mixed-precision inference frameworks, and parameter-efficient fine-tuning (LoRA, QLoRA).
  • Architect and implement model conversion and compilation pipelines using TensorRT for edge deployment.
  • Perform rigorous parity checking, accuracy recovery, and latency benchmarking between PyTorch frameworks and compiled edge binaries.
  • Develop and optimize custom ML OPs and TensorRT Plugins with efficient CUDA kernels to minimize latency and maximize memory-bandwidth utilization on AI accelerators.
  • Write production-level, low-latency, and memory-safe C++ and CUDA code for real-time inference on vehicle systems.
Qualifications:
  • Deep expertise in model quantization (PTQ, QAT) and mixed-precision inference frameworks (INT8, FP8, FP4, BF16/FP16).
  • Proven experience optimizing large-scale models (Multi-Modal Sensor Fusion models, LLMs, VLMs/VLAs) utilizing Efficient Attention mechanisms (e.g., FlashAttention, Linear Attention), KV-cache optimization (e.g., PagedAttention) and Speculative Decoding.
  • Extensive experience with model conversion/compilation pipelines (e.g., ONNX, TensorRT, torch.compile) and performing rigorous latency benchmarking and model-quality parity evaluation.
  • Proficiency in low-level programming for AI accelerators, specifically developing and optimizing custom ML OPs and TensorRT Plugins with efficient CUDA kernel implementations.
  • Production-level C++ (14/17/20) and Python programming skills, with experience developing concurrent, memory-safe, real-time inference code for edge devices.
Bonus Qualifications:
  • Familiarity with SOTA autonomous driving perception algorithms (temporal 3D object detection, BEV, 3D Occupancy Networks) and multi-modal sensor processing (Vision, LiDAR, Radar).
  • Experience with distributed training pipelines and model/tensor parallelism (PyTorch Distributed, Ray, DeepSpeed, Megatron-LM) and runtime efficiency optimization for GPU clusters.
  • Experience with end-to-end autonomous driving paradigms (VLM/VLA models, Foundation models) and edge deployment technologies (e.g., TensorRT-LLM).

Base Salary Range

$242,000 - $290,000 a year

There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. A sign-on bonus may be offered as part of the compensation package. The listed range applies only to the base salary. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position.

Zoox also offers a comprehensive package of benefits, including paid time off (e.g., sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.

About Zoox

Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We're looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.

Follow us on LinkedIn

Accommodations

If you need an accommodation to participate in the application or interview process, please reach out to [email protected] or your assigned recruiter.

A Final Note:

You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses and identifying potential inconsistencies or verification signals in application materials based on available information. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Vacancy posted 1 day ago