High-Performance AI Inference Engineer (TensorRT)

$124k - $195.5k

NVIDIA Gruppe

NVIDIA Gruppe is looking for a passionate Software Engineer to join its TensorRT team in Santa Clara, California. This role involves designing and developing high-performance AI inference solutions while contributing to performance optimizations and collaborating with various teams. The ideal candidate possesses a Master's or PhD in a relevant field and strong C++ skills. The position offers a competitive base salary ranging from 124,000 USD to 195,500 USD based on location and experience.

Vacancy posted 1 day ago

Similar jobs that could be interesting for you
Based on the High-Performance AI Inference Engineer (TensorRT) in Santa Clara, CA vacancy

AI Inference Systems Engineer - TensorRT Special Platforms
...Corporation is looking for a passionate Software Engineer to join the TensorRT team in Santa Clara, California. You will... ...in deep learning and work with cutting-edge AI technology, contributing to high-performance AI inference solutions. Your role involves designing and...
Performance
NVIDIA Corporation
Santa Clara, CA
1 day ago
Senior AI Inference Engineer - High-Performance LLM Serving
$152k - $241.5k
NVIDIA Gruppe is seeking a Senior Software Engineer - AI Inference in Santa Clara, California. This role involves enhancing open-source LLM serving optimizations and implementing high-performance runtime capabilities. Candidates should have 5+ years of experience in building...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
AI Inference Performance Engineer
$152k - $241.5k
...and benchmark GenAI inference on NVIDIA's latest... ...the industry’s performance standards across language... ...directly within TensorRT-LLM, SGLang, and... ...GPU performance engineering and public accountability... ...other emerging AI use cases.... ...learning inference or high-performance systems...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Senior AI-Native Systems Software Engineer, TensorRT
$152k - $241.5k
...in the age of Generative AI? Join NVIDIA’s TensorRT team to help lead a first... ...point for out-of-framework inference globally. We are moving... ...of AI agents to produce high-performance, high-quality, modern C++... ...are a systems‑thinking C++ engineer who wants to help scale...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Senior Software Engineer, Deep Learning Inference - TensorRT
$152k - $241.5k
Senior Software Engineer - Deep Learning Inference What you’ll be doing: Craft and develop robust inferencing software... ...multiple platforms for functionality and performance Develop components of TensorRT, NVIDIA’s SDK for high-performance deep learning inference. Closely...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
AI Inference Engineer
...Role: AI Inference Engineer Location: San Jose, CA Duration: 6 to... ...Overview: We are seeking a highly skilled AI Inference Engineer... ...our team and drive the performance, scalability, and reliability... ..., Triton Inference Server, TensorRT-LLM, TorchServe, or KServe...
Performance
Triune Infomatics Inc
San Jose, CA
17 hours ago
High Speed AI Interconnect Signal Integrity Engineer
$100k
High Speed AI Interconnect Signal Integrity Engineer Tenstorrent is leading the industry on cutting‑edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI... ...technologies for next‑generation AI inference and training clusters. This...
Performance
Permanent employment
Tenstorrent Inc.
Santa Clara, CA
4 days ago
Senior High Performance AI Engineer
$184k - $287.5k
...unlimited potential of AI to define the next era... ...Develop reproducible, high-fidelity evaluation frameworks covering performance, quality and developer... ...distributed training, and inference/serving—and with model/... ...Computer Science, Electrical Engineering, or related field (or...
Performance
NVIDIA
Santa Clara, CA
4 days ago
Senior AI Inference Compiler Engineer
$152k - $241.5k
...learning ignited modern AI — the next era of... ...Deep Learning Compiler Engineer. NVIDIA is hiring software... ...the backbone of NVIDIA’s inference engine, spanning across... ...deliver leading inference performance, fast build time,... ...kernel generation with high performance and fast build...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Principal AI Inference Engineer Open-Source & GPU-Focused
$272k - $431.25k
NVIDIA Gruppe is looking for a Principal Software Engineer to advance open-source AI inference. This hands-on role emphasizes running high-performance inference on NVIDIA platforms and involves collaboration across various teams. Key responsibilities include optimizing...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
DL Software Engineer - TensorRT Performance & Inference
NVIDIA Gruppe in Santa Clara is seeking a Deep Learning Software Engineer focused on improving performance of deep learning inference software like TensorRT. The ideal candidate will have a strong foundation in C++ and Python, relevant experience with deep learning frameworks...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Senior Staff Software Engineer - High Performance GPU Inference Systems
$248.71k - $292.6k
...Groq delivers fast, efficient AI inference. Our LPU-based system... ...we are on a mission to make high performance AI compute more accessible... ...Build fast. Sr. Staff Software Engineer - High Performance GPU... ...production (e.g., Triton, TensorRT, or custom GPU services). Deploying...
Performance
Groq
Palo Alto, CA
4 days ago
Senior AI Inference Systems Engineer: GPU-Optimized, Cloud
$184k - $356.5k
NVIDIA Gruppe is looking for skilled software engineers to develop AI inference systems that operate with high efficiency. The role involves architecting high-performance inference frameworks and optimizing GPU processes. Ideal candidates should have extensive programming...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
AI Inference Performance Engineer — Scale LLMs & GPU Clusters
$124k - $195.5k
NVIDIA Corporation is seeking an AI Inference Performance Engineer - New College Grad 2026 in Santa Clara. This role involves optimizing AI inference benchmarks using NVIDIA’s accelerators and working with various teams on performance enhancements. Applicants should have...
Performance
NVIDIA Corporation
Santa Clara, CA
3 days ago
Senior AI Inference Performance Engineer (GPU/Cluster)
$152k - $241.5k
NVIDIA Gruppe is seeking a talented individual to optimize and benchmark GenAI inference using the latest acceleration technologies. The role involves driving industry benchmark results and architecting distributed inference systems. Required qualifications include a relevant...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Senior High-Performance AI Training Engineer
$184k - $287.5k
...tapping into the unlimited potential of AI to define the next era of computing. An era... ...platforms, identifying fundamental performance limiters. Prioritize and solve performance... ...most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Sr. Lead AI Engineer (Inference Optimization, FM hosting, AI Platform)
$229.9k - $262.4k
...Sr. Lead AI Engineer (Inference Optimization, FM hosting, AI Platform) Overview: At Capital One, we are creating responsible and... ...capabilities with breakthrough product experiences and scalable, high-performance AI infrastructure. At Capital One, you will help bring...
Performance
Full time
Part time
Local area
Capital One Financial Corp
San Jose, CA
4 days ago
Senior Software Engineer, Inference
$152k - $204k
...Senior Software Engineer, Inference Sunnyvale, CA / Bellevue... ...The Essential Cloud for AI™. Built for pioneers... ...superior infrastructure performance with deep technical... ...custom accelerators for high-efficiency workloads... ...frameworks (vLLM, Triton, TensorRT-LLM, Ray Serve,...
Performance
Permanent employment
Temporary work
Casual work
Work at office
Flexible hours
Shift work
CoreWeave
Sunnyvale, CA
3 days ago
Sr. AI Inference Systems Engineer
$120.1k - $225.7k
...Entails End-to-End Inference Optimization: Lead the optimization... ...: Design and implement high-performance inference frameworks;... ...team members to build a robust AI inference technical ecosystem... ...Computer Science, Electronic Engineering, AI, or related fields;...
Performance
Relocation package
Tencent
Palo Alto, CA
2 days ago
Senior AI Inference & Distributed Systems Engineer
...Devices is seeking a strategic software engineering lead in Santa Clara, California. This role involves improving application performance and engaging in sophisticated software engineering... ...include developing techniques for inference optimization and supporting the ROCm...
Performance
Advanced Micro Devices
Santa Clara, CA
3 days ago
Senior Software Engineer - TensorRT Edge-LLM
...time large language model inference? Join NVIDIA’s TensorRT Edge‑LLM team and help... ...next generation of edge AI for automotive and robotics... ...and robotics to deliver high‑performance, production‑ready solutions... ...Science, Electrical/Computer Engineering, or a closely related...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Deep Learning Software Engineer, TensorRT Performance - New College Grad 2026
$124k - $195.5k
Deep Learning Software Engineer, TensorRT Performance We are now looking for a Deep Learning... ...performance of NVIDIA’s inference ecosystem. NVIDIA is... ...breakthroughs in areas like Generative AI, Recommenders and Vision... ...employer. As we highly value diversity in our current...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
AI Inference Co-Design Engineer for Real-Time HW
$132k - $330k
Software Engineer, AI Inference Codesign The AI inference co-design team's goal is to take research... ...cutting-edge MLIR compiler and solve the performance bottlenecks needed to achieve real-... ...utilization and throughput Implement/improve highly performant micro kernels for Tesla's...
Performance
Hourly pay
Full time
Temporary work
Flexible hours
Tesla Motors, Inc.
Palo Alto, CA
22 hours ago
Principal AI/ML System Software Engineer
...Principal AI/ML System Software Engineer At d-Matrix, we are focused on unleashing... ...with distributed, high-performance software design and implementation... ...fields Experience with inference servers/model serving frameworks (such as TensorRT-LLM, vLLM, SGLang, etc.)...
Performance
Work experience placement
3 days per week
d-Matrix
Santa Clara, CA
2 days ago
Senior High-Performance AI Engineer — GPU & Multi-Agent Systems (Equity)
$184k - $287.5k
2100 NVIDIA USA in Santa Clara is seeking a Senior High Performance AI Engineer to design and optimize cutting-edge AI systems. The role involves collaboration with software and hardware teams to create innovative runtimes and orchestration tools for the CUDA ecosystem...
Performance
2100 NVIDIA USA
Santa Clara, CA
1 day ago
Staff Software Engineer, Inference
$188k - $275k
...Essential Cloud for AI™. Built for... ...superior infrastructure performance with deep... ...What You'll Do: Inference Platform Team The... ...powering low-latency, high-throughput AI workloads... ...a Staff Software Engineer (IC5) on the... ...as vLLM, Triton, TensorRT-LLM, Ray Serve, or...
Performance
Permanent employment
Temporary work
Casual work
Work at office
Flexible hours
CoreWeave
Sunnyvale, CA
8 days ago
Principal AI Inference Systems Engineer
...generation computing experiences-from AI and data centers, to PCs, gaming... ...for a Senior Staff AI Infra Engineer who is passionate about improving the performance of key applications and benchmarks... ...and accelerate LLM training and inference on AMD GPUs, improving kernel, communication...
Performance
Advanced Micro Devices, Inc.
Santa Clara, CA
22 hours ago
AI Platform Engineer, Training and Inference
...AI Platform Engineer - Training & Inference Saviynt's AI-powered identity platform manages and governs human... ...), and NVIDIA Triton (TensorRT/ONNX) as a unified deployment graph... ...memory sharing • Optimise inference performance: configure fractional GPU allocation...
Performance
Saviynt
Milpitas, CA
3 days ago
Senior AI/ML Research Engineer (Computer Vision)
...We're a team of engineers, clinicians, and innovators... ...helps care teams perform with greater precision... .... As a Senior AI/ML Research Engineer... ...multimodal stack-where a high-level model... ...Real-time / edge inference optimization (e.g., TensorRT, NVIDIA Jetson)....
Performance
Local area
Worldwide
Flexible hours
Intuitive
Sunnyvale, CA
2 days ago
AI-Native Systems Engineer: High-Performance C++ & Agents
$152k - $241.5k
NVIDIA Gruppe is seeking a passionate C++ engineer to join the TensorRT team in Santa Clara, California. This role focuses on architecting an AI-native framework, utilizing advanced... ...learning techniques and optimizing performance through agent-based workflows. If you...
Performance
NVIDIA Gruppe
Santa Clara, CA
4 days ago
