Senior Deep Learning Frameworks CUDA Software Engineer
NVIDIA
$184k - $287.5k
NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars.
We are looking for a motivated Deep Learning engineer to bring advanced CUDA features and Distributed Runtime technologies into AI stacks, including PyTorch, TRT-LLM, vLLM, SGLang, JAX, etc. You will be working with the team that created core CUDA features and runtimes for scaling Deep Learning and HPC applications. Your customers will have diverse multi-GPU demands, ranging from training at scales of up to 100K GPUs to inference at microsecond latency. CUDA features improve both the productivity and the performance of AI applications, and your work in AI toolkits will speed up making those features available to the community. This is an outstanding opportunity for someone with an AI background to advance the state of the art in this space. Are you ready to contribute to the development of innovative technologies and help realize NVIDIA's vision?
What you will be doing:
Integrate new CUDA features and Runtime abstractions in AI frameworks: from PoC to performance analysis to production
Perform deep analysis of AI workloads and frameworks to identify requirements and opportunities to innovate in the lower layers of the stack. Collaborate hands-on with teams working on the latest AI models.
Own and drive improvements in the AI Compiler-Runtime interface to build speed-of-light multi-GPU multi-node solutions.
Design fault-tolerant and elastic solutions for large-scale or dynamic AI workloads.
Influence the roadmap of core CUDA to facilitate building next-gen DL frameworks.
Collaborate with a very dynamic team across multiple time zones.
Collaborate closely with AI researchers, HW and SW architects, kernel and compiler authors and CUDA driver experts to co-design systems and frameworks that enhance performance and programmability.
Develop exploratory tools and runtime systems to profile and accelerate new paradigms in deep learning.
Write clean, effective, and maintainable code, ensuring exploratory prototypes can smoothly transition into open-source releases, upstream framework integrations, internal tools, or closed-source commercial products.
What we need to see:
BS, MS, or PhD degree in Computer Science, Computer Engineering, Electrical Engineering, or related field (or equivalent experience).
8+ years of relevant industry experience or equivalent academic experience after completed degree.
Development experience with Deep Learning Frameworks such as PyTorch and JAX, and Inference Engines such as TRT-LLM, vLLM, and SGLang
Rapid prototyping and development with Python, C++, CUDA or related DSLs
Solid grasp of AI models, parallelisms, and/or compiler technologies (e.g. torch.compile)
Experience conducting performance benchmarking on AI clusters. Familiarity with at least one performance profiler toolchain (PyTorch profiler, NVIDIA Nsight Systems)
Understanding of HPC/AI communication concepts
Good understanding of computer system architecture, HW-SW interactions and operating systems principles (aka systems software fundamentals)
Adaptability and passion to learn new frameworks and tools
Flexibility to work and communicate effectively across different teams and timezones
Ways to stand out from the crowd:
Deep expertise in the performance internals and execution graphs of major deep learning autograd, training and inference frameworks (e.g., PyTorch, JAX, TensorRT, vLLM, SGLang, NeMo, Megatron, MaxText, etc.).
Hands-on experience with CUDA, specific communication libraries (e.g., NCCL, MPI, UCX) and distributed machine learning techniques (e.g., pipeline parallelism, tensor parallelism).
Expertise in one or more of these areas: Training, Distributed inference, MoE, Reinforcement Learning, kernel authoring (on CUDA, Triton, cuTe, etc).
Background in deep learning compilers, both graph-level and codegen (e.g., Triton, XLA, torch.compile)
Experience with programming for compute & communication overlap in distributed runtime
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.
You will also be eligible for equity and benefits.
Applications for this job will be accepted at least until May 18, 2026.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

