Senior GPU AI Inference Engineer - Triton & Dynamo

NVIDIA

A leading technology company is seeking a Senior System Software Engineer to develop GPU-accelerated AI inference serving software. The ideal candidate will have over 5 years of experience with deep learning software, strong skills in Rust and C++, and a collaborative approach in a fast-paced environment. This role offers competitive salary options based on experience, along with other incentives. Join us in shaping the AI landscape and contribute to important projects in a diverse and inclusive team.

Vacancy posted 2 days ago

Similar jobs that could be interesting for you
Based on the Senior GPU AI Inference Engineer - Triton & Dynamo in Santa Clara, CA vacancy

Senior System Software Engineer - Dynamo-Triton Inference Server
We are looking for a Senior System Software Engineer to work on Dynamo-Triton Inference Server. NVIDIA is hiring software engineers for its GPU-accelerated deep learning software team. Academic and... ...using GPUs to power a revolution in AI, enabling breakthroughs in problems...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior AI Inference Systems Engineer: GPU-Optimized, Cloud
$184k - $356.5k
NVIDIA Gruppe is looking for skilled software engineers to develop AI inference systems that operate with high efficiency. The role involves architecting high-performance inference frameworks and optimizing GPU processes. Ideal candidates should have extensive programming...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior System Software Engineer — GPU AI Inference (Triton)
NVIDIA Gruppe is seeking a Senior System Software Engineer in Santa Clara, California, to develop world-class GPU-accelerated AI inference serving software. This role involves contributing to feature development and optimizing software for deployment in production environments...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior AI Inference Performance Engineer (GPU/Cluster)
$152k - $241.5k
NVIDIA Gruppe is seeking a talented individual to optimize and benchmark GenAI inference using the latest acceleration technologies. The role involves driving industry benchmark results and architecting distributed inference systems. Required qualifications include a relevant...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior Software Engineer - AI Inference
$152k - $241.5k
...platform upon which every new AI‑powered application is built. We are seeking a Senior Software Engineer - AI Inference to advance open‑source... ...work. Improve multi‑GPU inference performance and... ...to vLLM, SGLang, PyTorch, Triton, NCCL, Dynamo or adjacent serving/runtime...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior Staff Software Engineer - High Performance GPU Inference Systems
$248.71k - $292.6k
...Groq Groq delivers fast, efficient AI inference. Our LPU-based system powers GroqCloud... .... Build fast. Sr. Staff Software Engineer - High Performance GPU Inference Systems Mission Push the limits... ...systems in production (e.g., Triton, TensorRT, or custom GPU services)....
Senior
Groq
Palo Alto, CA
1 day ago
Principal AI Inference Engineer Open-Source & GPU-Focused
$272k - $431.25k
NVIDIA Gruppe is looking for a Principal Software Engineer to advance open-source AI inference. This hands-on role emphasizes running high-performance inference on NVIDIA platforms and involves collaboration across various teams. Key responsibilities include optimizing...
NVIDIA Gruppe
Santa Clara, CA
1 day ago
AI Inference Engineer
...Role: AI Inference Engineer Location: San Jose, CA Duration: 6 to 1... ...intersection of systems engineering, GPU optimization, and... ...frameworks such as vLLM, SGLang, Triton Inference Server, TensorRT-... ...Serving Platform (Dynamo) Contribute to distributed...
Triune Infomatics Inc
San Jose, CA
7 days ago
AI Inference Performance Engineer
$152k - $241.5k
...optimize and benchmark GenAI inference on NVIDIA's latest... ...sits at the intersection of GPU performance engineering and public accountability.... ...workflows, and other emerging AI use cases. Collaborate with... ...cuteDSL, tilelang, OpenAI Triton) or compiler/runtime paths...
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior AI Performance Engineer—GPU & HPC
$184k - $356.5k
NVIDIA Gruppe is seeking a Senior Developer Technology Engineer in Santa Clara, California, to innovate AI workloads through GPU acceleration. This role involves deep research into optimizing algorithms for deep learning and enhancing CPU and GPU architecture designs. Ideal...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior AI Inference Kernel Engineer
$184k - $287.5k
NVIDIA Gruppe in Santa Clara is seeking an AI Systems Engineer to innovate and develop cutting-edge technologies in the AI inference software stack. Candidates should hold a Master's degree and possess over 6 years of experience in ML/DL systems development. The role involves...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior AI Inference Engineer - High-Performance LLM Serving
$152k - $241.5k
NVIDIA Gruppe is seeking a Senior Software Engineer - AI Inference in Santa Clara, California. This role involves enhancing open-source LLM serving optimizations and implementing high-performance runtime capabilities. Candidates should have 5+ years of experience in building...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior AI & HPC Engineer for Finance — GPU/CUDA
$152k - $287.5k
NVIDIA Gruppe is seeking a Senior AI Developer Technology Engineer for the Financial Sector to design and optimize parallel algorithms for high-performance AI workloads. The role involves researching GPU acceleration techniques for AI and HPC workloads, collaborating with...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior AI Engineer - GPU-Driven Multi-Agent Systems
A leading technology company in Santa Clara is seeking a Senior High Performance AI Engineer to build groundbreaking multi-agent systems for the CUDA... ...development, proficiency in C/C++ and Python, and experience with GPU programming. This role offers competitive salaries and...
Senior
Nvidia Corporation
Santa Clara, CA
5 days ago
Senior Software Development Engineer - SGLang and Inference Stack
...computing experiences-from AI and data centers, to... ...in enhancing GPU kernel performance, accelerating... ...SOTA LLM and Multimodal inference at scale across multi-... ...PERSON: Skilled engineer with strong technical... ...in GPGPU C++, Triton, TileLang or DSL development...
Senior
Advanced Micro Devices, Inc.
Santa Clara, CA
4 days ago
Senior Software Development Engineer - LLM Inference Framework
...computing experiences-from AI and data centers, to... ...THE ROLE: As a senior member of the LLM inference framework team, you will... ...intersection of inference engines, distributed systems, and GPU runtime and kernel... ...Collaborate with compiler teams (Triton, LLVM, ROCm) to unblock...
Senior
Advanced Micro Devices, Inc.
Santa Clara, CA
4 days ago
Senior Software Engineer, Inference
$152k - $204k
...Senior Software Engineer, Inference Sunnyvale, CA / Bellevue, WA CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers... ...-per-token analytics, GPU resource isolation).... ...inference frameworks (vLLM, Triton, TensorRT-LLM, Ray Serve,...
Senior
Permanent employment
Temporary work
Casual work
Work at office
Flexible hours
Shift work
CoreWeave
Sunnyvale, CA
5 days ago
Senior AI Systems Engineer: Inference Kernels & Runtimes
$184k - $287.5k
NVIDIA Gruppe is seeking talented AI systems engineers to advance innovative technologies in AI inference systems software. This role involves developing cutting-edge libraries, code generators, and kernel technologies for NVIDIA's architecture, emphasizing high-impact...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior Software Engineer I, Inference
$139k - $204k
...Senior Software Engineer I, Inference Sunnyvale, CA / Bellevue, WA CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers... ...-per-token analytics, GPU resource isolation).... ...inference frameworks (vLLM, Triton, TensorRT-LLM, Ray Serve...
Senior
Permanent employment
Temporary work
Casual work
Work at office
Remote work
Flexible hours
Shift work
CoreWeave
Sunnyvale, CA
5 days ago
Senior High-Performance AI Engineer — GPU & Multi-Agent Systems (Equity)
$184k - $287.5k
2100 NVIDIA USA in Santa Clara is seeking a Senior High Performance AI Engineer to design and optimize cutting-edge AI systems. The role involves collaboration... ...Python programming skills, and hands-on experience with GPU programming. NVIDIA offers a competitive salary range of...
Senior
2100 NVIDIA USA
Santa Clara, CA
3 days ago
Senior Windows AI Platform Engineer — GPU-Driven AI Deployment
NVIDIA Gruppe is looking for an experienced GPU Deployment Engineer to tackle end-to-end AI deployment challenges on the NVIDIA RTX AI platform. The role involves analyzing GPU-accelerated applications, improving user experiences, and collaborating with teams to influence...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior Software Engineer, AI Inference Systems
$184k - $287.5k
...highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale... ...performance inference stacks, optimize GPU kernels and compilers, drive... ...with ML compilers and DSLs (e.g., Triton, TorchDynamo/Inductor, MLIR/LLVM, XLA...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Principal AI Inference Systems Engineer
...computing experiences-from AI and data centers, to PCs, gaming... ...AMD is looking for a Senior Staff AI Infra Engineer who is passionate about improving... ...on AI/ML workloads and GPU-accelerated computing. As... ...accelerate LLM training and inference on AMD GPUs, improving kernel...
Advanced Micro Devices, Inc.
Santa Clara, CA
2 days ago
Senior ML Inference Engineer - Platform
$128.7k - $261.3k
The Model Deployment & Inference Solutions team in GM AV deploys machine... ...equivalent) as part of your engineering workflow. Experience... ...Familiarity with the NVIDIA GPU stack at the integration level... ...CUDA-aware Python, TensorRT, Triton inference server, torch.compile...
Senior
Flexible hours
General Motors
Sunnyvale, CA
4 days ago
Senior AI Inference Compiler Engineer — Equity Eligible
$152k - $241.5k
NVIDIA Gruppe is seeking an AI & Deep Learning Compiler Engineer for its Deep Learning & AI Compiler team in Santa Clara, California. This role involves analyzing and optimizing deep learning networks, as well as developing compiler algorithms to enhance performance on...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior AI & DL Kernel Engineer for Inference & GPUs Remote
$184k - $287.5k
A leading technology company is seeking a Senior Software Engineer for AI and DL Kernel Libraries in Santa Clara, CA. The role involves designing and optimizing kernels for high-impact AI workloads and collaborating with engineers on innovative solutions. Candidates should...
Senior
Remote job
NVIDIA Corporation
Santa Clara, CA
2 days ago
Senior AI Engineer
$209k
...Machine Learning Platform Engineer Immigration... ...performance LLM training GPU infrastructure and cluster... ...Understand the auto scale for inference service and multi-... ...and resource-efficient AI workloads across multi-... ...kernels (e.g., CUDA, Triton); • Systems Programming...
Senior
Work at office
Remote work
1 day per week
Zoom Video Communications
San Jose, CA
5 days ago
Senior High Performance AI Engineer
$184k - $287.5k
...the unlimited potential of AI to define the next era of computing... .... An era in which our GPU acts as the brains of... ...are looking for outstanding Senior High Performance AI Engineer to build groundbreaking multi... ...distributed training, and inference/serving—and with model/agent...
Senior
2100 NVIDIA USA
Santa Clara, CA
5 days ago
Senior AI Infrastructure Engineer
$180k - $240k
...role We are seeking a Senior AI Infrastructure Engineer to design, build, and... ...Architect and optimize multi-GPU setups, ensuring efficient... ...Operator, KubeFlow). Inference Performance Engineering: Deploy... ..., ONNX Runtime, and Triton Inference Server, fine-tuning...
Senior
Odd job
Work at office
Gatik AI
Mountain View, CA
3 days ago
Senior AI Algorithm Engineer in oneDNN
$195.2k - $275.58k
The Software and AI (SAI) organization is seeking a highly skilled Software Development Engineer to contribute to the development and optimization... ...best‑in‑class deep‑learning inference and training throughput on... ...development on Linux GPU optimizations (OpenCL, CUDA,...
Senior
Local area
Remote work
Worldwide
Flexible hours
Shift work
Intel Corporation
Santa Clara, CA
5 days ago
