Senior AI Inference Engineer - Model Optimization & Deployment
$242k - $290k
Zoox Inc.
The Perception team is pioneering the development of a multi-modality foundation model to drive the next generation of autonomous system intelligence.
As a Model Optimization & Deployment Engineer, you will focus on bringing highly efficient, production-ready large-scale models to our on-vehicle stack. We are looking for experts with hands-on experience in compressing, accelerating, and deploying complex models (LLMs, VLMs, or FMs) for power- and thermal-constrained vehicle SoCs. You will optimize ML models, write custom CUDA kernels, and build highly concurrent inference code to ensure real-time, deterministic execution on edge devices.
In this role, you will:
- Optimize large-scale models (Multi-Modal Sensor Fusion models, LLMs, VLMs) using advanced quantization (PTQ, QAT), pruning, mixed-precision inference frameworks, and parameter-efficient fine-tuning (LoRA, QLoRA).
- Architect and implement model conversion and compilation pipelines using TensorRT for edge deployment.
- Perform rigorous parity checking, accuracy recovery, and latency benchmarking between PyTorch models and compiled edge binaries.
- Develop and optimize custom ML OPs and TensorRT Plugins with efficient CUDA kernels to minimize latency and maximize memory-bandwidth utilization on AI accelerators.
- Write production-level, low-latency, and memory-safe C++ and CUDA code for real-time inference on vehicle systems.
Qualifications:
- Deep expertise in model quantization (PTQ, QAT) and mixed-precision inference frameworks (INT8, FP8, FP4, BF16/FP16).
- Proven experience optimizing large-scale models (Multi-Modal Sensor Fusion models, LLMs, VLMs/VLAs) using efficient attention mechanisms (e.g., FlashAttention, linear attention), KV-cache optimization (e.g., PagedAttention), and speculative decoding.
- Extensive experience with model conversion/compilation pipelines (e.g., ONNX, TensorRT, torch.compile) and with rigorous latency benchmarking and model-quality parity evaluation.
- Proficiency in low-level programming for AI accelerators, specifically developing and optimizing custom ML OPs and TensorRT Plugins with efficient CUDA kernel implementations.
- Production-level C++ (14/17/20) and Python programming skills, with experience developing concurrent, memory-safe, real-time inference code for edge devices.
- Familiarity with SOTA autonomous driving perception algorithms (temporal 3D object detection, BEV, 3D Occupancy Networks) and multi-modal sensor processing (Vision, LiDAR, Radar).
- Experience with distributed training pipelines and model/tensor parallelism (PyTorch Distributed, Ray, DeepSpeed, Megatron-LM) and runtime efficiency optimization for GPU clusters.
- Experience with end-to-end autonomous driving paradigms (VLM/VLA models, Foundation models) and edge deployment technologies (e.g., TensorRT-LLM).
Vacancy posted 1 day ago

