Engineering Manager, Inference Benchmarking — AI Perf
$224k - $356.5k
NVIDIA Gruppe
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s an outstanding legacy of innovation that’s fueled by great technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing: an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

NVIDIA’s open-source benchmarking platform, AIPerf, is the growing standard for assessing LLM serving performance across various inference frameworks. Hyperscalers, cloud providers, and enterprises use AIPerf to inform decisions on production inference, including choosing GPUs, optimizing costs, reducing latency, improving efficiency, and scaling. AIPerf spans LLM, multimodal, diffusion, and computer vision inference.

This position combines hands-on leadership with expertise in systems engineering, inference infrastructure, and open-source communities. It has a direct effect on how AI performance is measured and pushed forward. As Technical Lead Manager, you will lead the engineering team within NVIDIA’s Dynamo organization. Your responsibility is to build and advance the platform so AIPerf becomes the leading benchmarking tool for datacenter, local, and edge use cases.

What you’ll be doing:
- Driving the technical roadmap for AIPerf’s core infrastructure: load generation, ZMQ-based microservices, GPU telemetry (DCGM/PyNVML), Prometheus metrics, statistical confidence intervals, and Kubernetes-native deployment.
- Taking ownership of the accuracy and statistical soundness of benchmark results that engineering groups throughout the industry depend on to inform production infrastructure decisions.
- Advising upstream engine integrations involving vLLM, TRT-LLM, and SGLang in partnership with NVIDIA’s Dynamo and NIM teams to maintain AIPerf’s relevance across emerging hardware, workload categories, and inference configurations.
- Hiring, mentoring, and growing a team of senior engineers operating in a high-velocity open-source environment with active external contributors worldwide.

What we need to see:
- Bachelor’s degree in Computer Science, Electrical Engineering, or related field, or equivalent experience.
- 8+ years overall of software engineering experience building performance-critical infrastructure, ML tooling, or distributed systems.
- 3+ years of engineering leadership experience as a tech lead, TLM, or engineering manager.
- Deep understanding of LLM inference mechanics (TTFT, ITL, KV caching, prefill/decode, speculative decoding) and the ability to reason about measurement correctness and reproducibility.
- Proven track record of collaborating across cross-functional groups and delivering production-quality output in high-velocity, high-external-visibility environments.

Ways to stand out from the crowd:
- Extensive experience with vLLM, TRT-LLM, or SGLang internals, along with contributions to their upstream projects.
- Experience building Kubernetes-native infrastructure, including operators, Helm charts, and GPU observability tooling (DCGM, dcgm-exporter, PyNVML).
- Background in competitive benchmarking frameworks such as MLPerf or equivalent industry-standard evaluation systems.
- History of leading or making meaningful contributions to active open-source projects with external communities.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package.
As you plan your future, see what we can offer to you and your family.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD to 356,500 USD. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 1, 2026. This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.


