Engineering Manager, Inference Benchmarking — AI Perf

$224k - $356.5k

NVIDIA Gruppe

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s an outstanding legacy of innovation that’s fueled by great technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing: an era in which our GPU acts as the brains of computers, robots, and self‑driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

NVIDIA’s open‑source benchmarking platform, AIPerf, is the growing standard for assessing LLM serving performance across various inference frameworks. Hyperscalers, cloud providers, and enterprises use AIPerf to inform decisions on production inference, including choosing GPUs, optimizing costs, reducing latency, improving efficiency, and scaling. AIPerf spans LLM, multimodal, diffusion, and computer vision inference.

This position combines hands‑on leadership with expertise in systems engineering, inference infrastructure, and open‑source communities. It has a direct effect on how AI performance is measured and pushed forward. As Technical Lead Manager, you will lead the engineering team within NVIDIA’s Dynamo organization. Your responsibility is to build and advance the platform so AIPerf becomes the leading benchmarking tool for datacenter, local, and edge use cases.

What you’ll be doing:
  • Driving the technical roadmap for AIPerf’s core infrastructure: load generation, ZMQ‑based microservices, GPU telemetry (DCGM/PyNVML), Prometheus metrics, statistical confidence intervals, and Kubernetes‑native deployment.
  • Taking ownership of the accuracy and statistical soundness of benchmark results that engineering groups throughout the industry depend on to inform production infrastructure decisions.
  • Advising on upstream engine integrations involving vLLM, TRT‑LLM, and SGLang in partnership with NVIDIA’s Dynamo and NIM teams to maintain AIPerf’s relevance across emerging hardware, workload categories, and inference configurations.
  • Hiring, mentoring, and growing a team of senior engineers operating in a high‑velocity open‑source environment with active external contributors worldwide.

What we need to see:
  • Bachelor’s degree in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
  • 8+ years of overall software engineering experience building performance‑critical infrastructure, ML tooling, or distributed systems.
  • 3+ years of engineering leadership experience as a tech lead, TLM, or engineering manager.
  • Deep understanding of LLM inference mechanics (TTFT, ITL, KV caching, Prefill/Decode, speculative decoding) and the ability to reason about measurement correctness and reproducibility.
  • Proven track record of collaborating across multi‑functional groups and delivering production‑quality output in high‑velocity, high‑external‑visibility environments.

Ways to stand out from the crowd:
  • Extensive experience with vLLM, TRT‑LLM, or SGLang internals along with contributions to their upstream projects.
  • Experience building Kubernetes‑native infrastructure including operators, Helm charts, and GPU observability tooling (DCGM, dcgm‑exporter, PyNVML).
  • Background in competitive benchmarking frameworks such as MLPerf or equivalent industry‑standard evaluation systems.
  • History of leading or making meaningful contributions to active open‑source projects with external communities.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer to you and your family.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD – 356,500 USD. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 1, 2026. This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 3 days ago
