Senior Software Engineer, AI Inference Systems
$184k - $287.5k
NVIDIA
We are seeking highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency. You’ll architect and implement high-performance inference stacks, optimize GPU kernels and compilers, drive industry benchmarks, and scale workloads across multi-GPU, multi-node, and multi-cloud environments. You’ll collaborate across inference, compiler, scheduling, and performance teams to push the frontier of accelerated computing for AI.
What you’ll be doing:
Contribute features to vLLM that enable the newest models to use the latest NVIDIA GPU hardware features; profile and optimize the inference framework (vLLM) with methods such as speculative decoding, data/tensor/expert/pipeline parallelism, and prefill-decode disaggregation.
Develop, optimize, and benchmark GPU kernels (hand-tuned and compiler-generated) using techniques such as fusion, autotuning, and memory/layout optimization; build and extend high-level DSLs and compiler infrastructure to boost kernel developer productivity while approaching peak hardware utilization.
Define and build inference benchmarking methodologies and tools; contribute both new benchmarks and NVIDIA’s submissions to the industry-leading MLPerf Inference benchmarking suite.
Architect the scheduling and orchestration of containerized large-scale inference deployments on GPU clusters across clouds.
Conduct and publish original research that pushes the Pareto frontier of ML Systems; survey recent publications and find ways to integrate research ideas and prototypes into NVIDIA’s software products.
What we need to see:
Bachelor’s degree (or equivalent experience) in Computer Science (CS), Computer Engineering (CE), or Software Engineering (SE) with 7+ years of experience; alternatively, a Master’s degree in CS/CE/SE with 5+ years of experience; or a PhD with a thesis and top-tier publications in ML Systems, GPU architecture, or high-performance computing.
Strong programming skills in Python and C/C++; experience with Go or Rust is a plus; solid CS fundamentals: algorithms and data structures, operating systems, computer architecture, parallel programming, distributed systems, and deep learning theory.
Knowledgeable and passionate about performance engineering in ML frameworks (e.g., PyTorch) and inference engines (e.g., vLLM and SGLang).
Familiarity with GPU programming and performance: CUDA, memory hierarchy, streams, NCCL; proficiency with profiling/debug tools (e.g., Nsight Systems/Compute).
Experience with containers and orchestration (Docker, Kubernetes, Slurm); familiarity with Linux namespaces and cgroups.
Excellent debugging, problem-solving, and communication skills; ability to excel in a fast-paced, multi-functional setting.
Ways to stand out from the crowd:
Experience building and optimizing LLM inference engines (e.g., vLLM, SGLang).
Hands-on work with ML compilers and DSLs (e.g., Triton, TorchDynamo/Inductor, MLIR/LLVM, XLA), GPU libraries (e.g., CUTLASS) and features (e.g., CUDA Graph, Tensor Cores).
Experience contributing to containerization/virtualization technologies such as containerd/CRI-O/CRIU.
Experience with cloud platforms (AWS/GCP/Azure), infrastructure as code, CI/CD, and production observability.
Contributions to open-source projects and/or publications; please include links to GitHub pull requests, published papers and artifacts.
At NVIDIA, we believe artificial intelligence (AI) will fundamentally transform how people live and work. Our mission is to advance AI research and development to create groundbreaking technologies that enable anyone to harness the power of AI and benefit from its potential. Our team consists of experts in AI, systems and performance optimization. Our leadership includes world-renowned experts in AI systems who have received multiple academic and industry research awards. If you’re excited to build systems, kernels, and tools that make large-scale AI faster, more efficient, and easier to deploy, we’d love to hear from you.
#LI-Hybrid
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.
You will also be eligible for equity and benefits.
Applications for this job will be accepted at least until May 2, 2026.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.