Senior Software Engineer, AI Inference Systems

$184k - $287.5k

NVIDIA

We are seeking highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency. You’ll architect and implement high-performance inference stacks, optimize GPU kernels and compilers, drive industry benchmarks, and scale workloads across multi-GPU, multi-node, and multi-cloud environments. You’ll collaborate across inference, compiler, scheduling, and performance teams to push the frontier of accelerated computing for AI.

What you’ll be doing:

  • Contribute features to vLLM that empower the newest models with the latest NVIDIA GPU hardware features; profile and optimize the inference framework (vLLM) with methods like speculative decoding, data/tensor/expert/pipeline parallelism, and prefill-decode disaggregation.

  • Develop, optimize, and benchmark GPU kernels (hand-tuned and compiler-generated) using techniques such as fusion, autotuning, and memory/layout optimization; build and extend high-level DSLs and compiler infrastructure to boost kernel developer productivity while approaching peak hardware utilization.

  • Define and build inference benchmarking methodologies and tools; contribute both new benchmarks and NVIDIA’s submissions to the industry-leading MLPerf Inference benchmarking suite.

  • Architect the scheduling and orchestration of containerized large-scale inference deployments on GPU clusters across clouds.

  • Conduct and publish original research that pushes the Pareto frontier for the field of ML Systems; survey recent publications and find ways to integrate research ideas and prototypes into NVIDIA’s software products.

What we need to see:

  • Bachelor’s degree (or equivalent experience) in Computer Science (CS), Computer Engineering (CE) or Software Engineering (SE) with 7+ years of experience; alternatively, Master’s degree in CS/CE/SE with 5+ years of experience; or PhD degree with a thesis and top-tier publications in ML Systems, GPU architecture, or high-performance computing.

  • Strong programming skills in Python and C/C++; experience with Go or Rust is a plus; solid CS fundamentals: algorithms & data structures, operating systems, computer architecture, parallel programming, distributed systems, and deep learning theory.

  • Knowledgeable and passionate about performance engineering in ML frameworks (e.g., PyTorch) and inference engines (e.g., vLLM and SGLang).

  • Familiarity with GPU programming and performance: CUDA, memory hierarchy, streams, NCCL; proficiency with profiling/debug tools (e.g., Nsight Systems/Compute).

  • Experience with containers and orchestration (Docker, Kubernetes, Slurm); familiarity with Linux namespaces and cgroups.

  • Excellent debugging, problem-solving, and communication skills; ability to excel in a fast-paced, multi-functional setting.

Ways to stand out from the crowd:

  • Experience building and optimizing LLM inference engines (e.g., vLLM, SGLang).

  • Hands-on work with ML compilers and DSLs (e.g., Triton, TorchDynamo/Inductor, MLIR/LLVM, XLA), GPU libraries (e.g., CUTLASS) and features (e.g., CUDA Graph, Tensor Cores).

  • Experience contributing to containerization/virtualization technologies such as containerd/CRI-O/CRIU.

  • Experience with cloud platforms (AWS/GCP/Azure), infrastructure as code, CI/CD, and production observability.

  • Contributions to open-source projects and/or publications; please include links to GitHub pull requests, published papers and artifacts.

At NVIDIA, we believe artificial intelligence (AI) will fundamentally transform how people live and work. Our mission is to advance AI research and development to create groundbreaking technologies that enable anyone to harness the power of AI and benefit from its potential. Our team consists of experts in AI, systems and performance optimization. Our leadership includes world-renowned experts in AI systems who have received multiple academic and industry research awards. If you’re excited to build systems, kernels, and tools that make large-scale AI faster, more efficient, and easier to deploy, we’d love to hear from you.

#LI-Hybrid

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until May 2, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 1 day ago