Senior Deep Learning Frameworks CUDA Software Engineer

$184k - $287.5k

NVIDIA

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars.

We are looking for a motivated Deep Learning engineer to bring advanced CUDA features and Distributed Runtime technologies into AI stacks such as PyTorch, TRT-LLM, vLLM, SGLang, and JAX. You will be working with the team that created core CUDA features and runtimes for scaling Deep Learning and HPC applications. Your customers will have diverse multi-GPU demands, ranging from training at scales of up to 100K GPUs to inference at microsecond latency. CUDA features improve both the productivity and the performance of AI applications, and your work in AI toolkits will speed up making those features available to the community. This is an outstanding opportunity for someone with an AI background to advance the state of the art in this space. Are you ready to contribute to the development of innovative technologies and help realize NVIDIA's vision?

What you will be doing:

  • Integrate new CUDA features and Runtime abstractions in AI frameworks: from PoC to performance analysis to production

  • Perform deep analysis of AI workloads and frameworks to identify requirements and opportunities to innovate in the lower layers of the stack. Collaborate hands-on with teams working on the latest AI models.

  • Own and drive improvements in the AI Compiler-Runtime interface to build speed-of-light multi-GPU multi-node solutions.

  • Design fault-tolerant and elastic solutions for large-scale or dynamic AI workloads.

  • Influence the roadmap of core CUDA to facilitate building next-gen DL frameworks.

  • Collaborate with a very dynamic team across multiple time zones.

  • Collaborate closely with AI researchers, HW and SW architects, kernel and compiler authors and CUDA driver experts to co-design systems and frameworks that enhance performance and programmability.

  • Develop exploratory tools and runtime systems to profile and accelerate new paradigms in deep learning.

  • Write clean, effective, and maintainable code, ensuring exploratory prototypes can smoothly transition into open-source releases, upstream framework integrations, internal tools, or closed-source commercial products.

What we need to see:

  • BS, MS, or PhD degree in Computer Science, Computer Engineering, Electrical Engineering, or related field (or equivalent experience).

  • 8+ years of relevant industry experience or equivalent academic experience after completed degree.

  • Development experience with Deep Learning Frameworks such as PyTorch and JAX, and Inference Engines such as TRT-LLM, vLLM, and SGLang

  • Rapid prototyping and development with Python, C++, CUDA or related DSLs

  • Solid grasp of AI models, parallelisms, and/or compiler technologies (e.g. torch.compile)

  • Experience conducting performance benchmarking on AI clusters. Familiarity with at least one performance profiler toolchain (PyTorch profiler, NVIDIA Nsight Systems)

  • Understanding of HPC/AI communication concepts

  • Good understanding of computer system architecture, HW-SW interactions and operating systems principles (aka systems software fundamentals)

  • Adaptability and passion to learn new frameworks and tools

  • Flexibility to work and communicate effectively across different teams and timezones

Ways to stand out from the crowd:

  • Deep expertise in the performance internals and execution graphs of major deep learning autograd, training, and inference frameworks (e.g., PyTorch, JAX, TensorRT, vLLM, SGLang, NeMo, Megatron, MaxText).

  • Hands-on experience with CUDA, specific communication libraries (e.g., NCCL, MPI, UCX) and distributed machine learning techniques (e.g., pipeline parallelism, tensor parallelism).

  • Expertise in one or more of these areas: Training, Distributed inference, MoE, Reinforcement Learning, kernel authoring (in CUDA, Triton, CuTe, etc.).

  • Background in deep learning compilers, both graph-level and codegen (e.g., Triton, XLA, torch.compile)

  • Experience programming for compute and communication overlap in distributed runtimes

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until May 18, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 8 hours ago