Engineering Manager, Inference Benchmarking — AI Perf

$224k - $356.5k

NVIDIA Gruppe

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s an outstanding legacy of innovation that’s fueled by great technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

NVIDIA’s open-source benchmarking platform, AIPerf, is the growing standard for assessing LLM serving performance across various inference frameworks. Hyperscalers, cloud providers, and enterprises use AIPerf to inform decisions on production inference, including choosing GPUs, optimizing costs, reducing latency, improving efficiency, and scaling. AIPerf spans LLM, multimodal, diffusion, and computer vision inference.

This position combines hands-on leadership with expertise in systems engineering, inference infrastructure, and open-source communities. It has a direct effect on how AI performance is measured and pushed forward. As Technical Lead Manager, you will lead the engineering team within NVIDIA’s Dynamo organization. Your responsibility is to build and advance the platform so AIPerf becomes the leading benchmarking tool for datacenter, local, and edge use cases.

What you’ll be doing:
Driving the technical roadmap for AIPerf’s core infrastructure: load generation, ZMQ-based microservices, GPU telemetry (DCGM/PyNVML), Prometheus metrics, statistical confidence intervals, and Kubernetes-native deployment.
Taking ownership of the accuracy and statistical soundness of benchmark results that engineering groups throughout the industry depend on to inform production infrastructure decisions.
Advising on upstream engine integrations with vLLM, TRT-LLM, and SGLang, in partnership with NVIDIA’s Dynamo and NIM teams, to maintain AIPerf’s relevance across emerging hardware, workload categories, and inference configurations.
Hiring, mentoring, and growing a team of senior engineers operating in a high-velocity open-source environment with active external contributors worldwide.

What we need to see:
Bachelor’s degree in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
8+ years of overall software engineering experience building performance-critical infrastructure, ML tooling, or distributed systems.
3+ years of engineering leadership experience as a tech lead, TLM, or engineering manager.
Deep understanding of LLM inference mechanics (TTFT, ITL, KV caching, Prefill/Decode, speculative decoding) and the ability to reason about measurement correctness and reproducibility.
Proven track record of collaborating across multi-functional groups and delivering production-quality output in high-velocity, high-external-visibility environments.

Ways to stand out from the crowd:
Extensive experience with vLLM, TRT-LLM, or SGLang internals, along with contributions to their upstream projects.
Experience building Kubernetes-native infrastructure, including operators, Helm charts, and GPU observability tooling (DCGM, dcgm-exporter, PyNVML).
Background in competitive benchmarking frameworks such as MLPerf or equivalent industry-standard evaluation systems.
History of leading or making meaningful contributions to active open-source projects with external communities.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer to you and your family.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD - 356,500 USD. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 1, 2026. This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 3 days ago
