High-Performance AI Inference Engineer (TensorRT)
$124k - $195.5k
NVIDIA Gruppe
NVIDIA Gruppe is looking for a passionate Software Engineer to join its TensorRT team in Santa Clara, California. This role involves designing and developing high-performance AI inference solutions while contributing to performance optimizations and collaborating with various teams. The ideal candidate possesses a Master's or PhD in relevant fields and strong C++ skills. The position offers a competitive base salary ranging from 124,000 USD to 195,500 USD based on location and experience.
- ...Corporation is looking for a passionate Software Engineer to join the TensorRT team in Santa Clara, California. You will... ...in deep learning and work with cutting-edge AI technology, contributing to high-performance AI inference solutions. Your role involves designing and...
Performance
$152k - $241.5k
NVIDIA Gruppe is seeking a Senior Software Engineer - AI Inference in Santa Clara, California. This role involves enhancing open-source LLM serving optimizations and implementing high-performance runtime capabilities. Candidates should have 5+ years of experience in building...
Performance
$152k - $241.5k
...and benchmark GenAI inference on NVIDIA's latest... ...the industry’s performance standards across language... ...directly within TensorRT-LLM, SGLang, and... ...GPU performance engineering and public accountability... ...other emerging AI use cases.... ...learning inference or high-performance systems...
Performance
$152k - $241.5k
...in the age of Generative AI? Join NVIDIA’s TensorRT team to help lead a first... ...point for out-of-framework inference globally. We are moving... ...of AI agents to produce high-performance, high-quality, modern C++... ...are a systems‑thinking C++ engineer who wants to help scale...
Performance
$152k - $241.5k
Senior Software Engineer - Deep Learning Inference What you’ll be doing: Craft and develop robust inferencing software... ...multiple platforms for functionality and performance Develop components of TensorRT, NVIDIA’s SDK for high-performance deep learning inference. Closely...
Performance
- ...Role: AI Inference Engineer Location: San Jose, CA Duration: 6 to... ...Overview: We are seeking a highly skilled AI Inference Engineer... ...our team and drive the performance, scalability, and reliability... ..., Triton Inference Server, TensorRT-LLM, TorchServe, or KServe...
Performance
$100k
High Speed AI Interconnect Signal Integrity Engineer Tenstorrent is leading the industry on cutting‑edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI... ...technologies for next‑generation AI inference and training clusters. This...
Performance, Permanent employment
$184k - $287.5k
...unlimited potential of AI to define the next era... ...Develop reproducible, high-fidelity evaluation frameworks covering performance, quality and developer... ...distributed training, and inference/serving—and with model/... ...Computer Science, Electrical Engineering, or related field (or...
Performance
$152k - $241.5k
...learning ignited modern AI — the next era of... ...Deep Learning Compiler Engineer. NVIDIA is hiring software... ...the backbone of NVIDIA’s inference engine, spanning across... ...deliver leading inference performance, fast build time,... ...kernel generation with high performance and fast build...
Performance
$272k - $431.25k
NVIDIA Gruppe is looking for a Principal Software Engineer to advance open-source AI inference. This hands-on role emphasizes running high-performance inference on NVIDIA platforms and involves collaboration across various teams. Key responsibilities include optimizing...
Performance
- NVIDIA Gruppe in Santa Clara is seeking a Deep Learning Software Engineer focused on improving performance of deep learning inference software like TensorRT. The ideal candidate will have a strong foundation in C++ and Python, relevant experience with deep learning frameworks...
Performance
$248.71k - $292.6k
...Groq delivers fast, efficient AI inference. Our LPU-based system... ...we are on a mission to make high performance AI compute more accessible... ...Build fast. Sr. Staff Software Engineer - High Performance GPU... ...production (e.g., Triton, TensorRT, or custom GPU services). Deploying...
Performance
$184k - $356.5k
NVIDIA Gruppe is looking for skilled software engineers to develop AI inference systems that operate with high efficiency. The role involves architecting high-performance inference frameworks and optimizing GPU processes. Ideal candidates should have extensive programming...
Performance
$124k - $195.5k
NVIDIA Corporation is seeking an AI Inference Performance Engineer - New College Grad 2026 in Santa Clara. This role involves optimizing AI inference benchmarks using NVIDIA’s accelerators and working with various teams on performance enhancements. Applicants should have...
Performance
$152k - $241.5k
NVIDIA Gruppe is seeking a talented individual to optimize and benchmark GenAI inference using the latest acceleration technologies. The role involves driving industry benchmark results and architecting distributed inference systems. Required qualifications include a relevant...
Performance
$184k - $287.5k
...tapping into the unlimited potential of AI to define the next era of computing. An era... ...platforms, identifying fundamental performance limiters. Prioritize and solve performance... ...most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive...
Performance
$229.9k - $262.4k
...Sr. Lead AI Engineer (Inference Optimization, FM hosting, AI Platform) Overview: At Capital One, we are creating responsible and... ...capabilities with breakthrough product experiences and scalable, high-performance AI infrastructure. At Capital One, you will help bring...
Performance, Full time, Part time, Local area
$152k - $204k
...Senior Software Engineer, Inference Sunnyvale, CA / Bellevue... ...The Essential Cloud for AI™. Built for pioneers... ...superior infrastructure performance with deep technical... ...custom accelerators for high-efficiency workloads... ...frameworks (vLLM, Triton, TensorRT-LLM, Ray Serve,...
Performance, Permanent employment, Temporary work, Casual work, Work at office, Flexible hours, Shift work
$120.1k - $225.7k
...Entails End-to-End Inference Optimization: Lead the optimization... ...: Design and implement high-performance inference frameworks;... ...team members to build a robust AI inference technical ecosystem... ...Computer Science, Electronic Engineering, AI, or related fields;...
Performance, Relocation package
- ...Devices is seeking a strategic software engineering lead in Santa Clara, California. This role involves improving application performance and engaging in sophisticated software engineering... ...include developing techniques for inference optimization and supporting the ROCm...
Performance
- ...time large language model inference? Join NVIDIA’s TensorRT Edge‑LLM team and help... ...next generation of edge AI for automotive and robotics... ...and robotics to deliver high‑performance, production‑ready solutions... ...Science, Electrical/Computer Engineering, or a closely related...
Performance
$124k - $195.5k
Deep Learning Software Engineer, TensorRT Performance We are now looking for a Deep Learning... ...performance of NVIDIA’s inference ecosystem. NVIDIA is... ...breakthroughs in areas like Generative AI, Recommenders and Vision... ...employer. As we highly value diversity in our current...
Performance
$132k - $330k
Software Engineer, AI Inference Codesign The AI inference co-design team's goal is to take research... ...cutting-edge MLIR compiler and solve the performance bottlenecks needed to achieve real-... ...utilization and throughput Implement/improve highly performant micro kernels for Tesla's...
Performance, Hourly pay, Full time, Temporary work, Flexible hours
- ...Principal AI/ML System Software Engineer At d-Matrix, we are focused on unleashing... ...with distributed, high-performance software design and implementation... ...fields Experience with inference servers/model serving frameworks (such as TensorRT-LLM, vLLM, SGLang, etc.)...
Performance, Work experience placement, 3 days per week
$184k - $287.5k
2100 NVIDIA USA in Santa Clara is seeking a Senior High Performance AI Engineer to design and optimize cutting-edge AI systems. The role involves collaboration with software and hardware teams to create innovative runtimes and orchestration tools for the CUDA ecosystem...
Performance
$188k - $275k
...Essential Cloud for AI™. Built for... ...superior infrastructure performance with deep... ...What You'll Do: Inference Platform Team The... ...powering low-latency, high-throughput AI workloads... ...a Staff Software Engineer (IC5) on the... ...as vLLM, Triton, TensorRT-LLM, Ray Serve, or...
Performance, Permanent employment, Temporary work, Casual work, Work at office, Flexible hours
- ...generation computing experiences-from AI and data centers, to PCs, gaming... ...for a Senior Staff AI Infra Engineer who is passionate about improving the performance of key applications and benchmarks... ...and accelerate LLM training and inference on AMD GPUs, improving kernel, communication...
Performance
- ...AI Platform Engineer - Training & Inference Saviynt's AI-powered identity platform manages and governs human... ...), and NVIDIA Triton (TensorRT/ONNX) as a unified deployment graph... ...memory sharing • Optimise inference performance: configure fractional GPU allocation...
Performance
- ...We're a team of engineers, clinicians, and innovators... ...helps care teams perform with greater precision... .... As a Senior AI/ML Research Engineer... ...multimodal stack-where a high-level model... ...Real-time / edge inference optimization (e.g., TensorRT, NVIDIA Jetson)....
Performance, Local area, Worldwide, Flexible hours
$152k - $241.5k
NVIDIA Gruppe is seeking a passionate C++ engineer to join the TensorRT team in Santa Clara, California. This role focuses on architecting an AI-native framework, utilizing advanced... ...learning techniques and optimizing performance through agent-based workflows. If you...
Performance