Senior Deep Learning Frameworks CUDA Software Engineer
NVIDIA
$184k - $287.5k
NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars.
We are looking for a motivated Deep Learning engineer to bring advanced CUDA features and Distributed Runtime technologies into AI stacks such as PyTorch, TRT-LLM, vLLM, SGLang, and JAX. You will be working with the team that created core CUDA features and runtimes for scaling Deep Learning and HPC applications. Your customers will have diverse multi-GPU demands, ranging from training at scales of up to 100K GPUs to inference at microsecond latency. CUDA features improve both the productivity and the performance of AI applications, and your work in AI toolkits will speed up making those features available to the community. This is an outstanding opportunity for someone with an AI background to advance the state of the art in this space. Are you ready to contribute to the development of innovative technologies and help realize NVIDIA's vision?
What you will be doing:
Integrate new CUDA features and Runtime abstractions in AI frameworks: from PoC to performance analysis to production
Perform deep analysis of AI workloads and frameworks to identify requirements and opportunities to innovate in the lower layers of the stack. Collaborate hands-on with teams working on the latest AI models.
Own and drive improvements in the AI Compiler-Runtime interface to build speed-of-light multi-GPU multi-node solutions.
Design fault-tolerant and elastic solutions for large-scale or dynamic AI workloads.
Influence the roadmap of core CUDA to facilitate building next-gen DL frameworks.
Collaborate with a very dynamic team across multiple time zones.
Collaborate closely with AI researchers, HW and SW architects, kernel and compiler authors and CUDA driver experts to co-design systems and frameworks that enhance performance and programmability.
Develop exploratory tools and runtime systems to profile and accelerate new paradigms in deep learning.
Write clean, effective, and maintainable code, ensuring exploratory prototypes can smoothly transition into open-source releases, upstream framework integrations, internal tools, or closed-source commercial products.
What we need to see:
BS, MS, or PhD degree in Computer Science, Computer Engineering, Electrical Engineering, or related field (or equivalent experience).
8+ years of relevant industry experience or equivalent academic experience after completed degree.
Development experience with Deep Learning Frameworks such as PyTorch and JAX, and Inference Engines such as TRT-LLM, vLLM, and SGLang
Rapid prototyping and development with Python, C++, CUDA or related DSLs
Solid grasp of AI models, parallelisms, and/or compiler technologies (e.g. torch.compile)
Experience conducting performance benchmarking on AI clusters. Familiarity with at least one performance profiler toolchain (PyTorch profiler, NVIDIA Nsight Systems)
Understanding of HPC/AI communication concepts
Good understanding of computer system architecture, HW-SW interactions and operating systems principles (aka systems software fundamentals)
Adaptability and passion to learn new frameworks and tools
Flexibility to work and communicate effectively across different teams and timezones
Ways to stand out from the crowd:
Deep expertise in the performance internals and execution graphs of major deep learning autograd, training and inference frameworks (e.g., PyTorch, JAX, TensorRT, vLLM, SGLang, NeMo, Megatron, MaxText, etc.).
Hands-on experience with CUDA, specific communication libraries (e.g., NCCL, MPI, UCX) and distributed machine learning techniques (e.g., pipeline parallelism, tensor parallelism).
Expertise in one or more of these areas: Training, Distributed inference, MoE, Reinforcement Learning, kernel authoring (on CUDA, Triton, CuTe, etc.).
Background in deep learning compilers, both graph-level and codegen (e.g., Triton, XLA, torch.compile)
Experience with programming for compute & communication overlap in distributed runtime
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.
You will also be eligible for equity and benefits.
Applications for this job will be accepted at least until May 18, 2026.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

