Senior Software Engineer, CUDA Deep Learning Systems
NVIDIA
$184k - $287.5k
We are looking for an experienced and highly motivated software professional to work on pioneering initiatives and projects at the intersection of CUDA and Deep Learning Systems. As the complexity and scale of artificial intelligence continue to grow, the intersection of advanced deep learning architectures, massive-scale distributed computing, and low-level hardware optimization has never been more critical. Our team is dedicated to exploring and prototyping next-generation ideas that bridge the gap between deep learning algorithms and CUDA, pushing the boundaries of what is possible on modern accelerator architectures.

Join our dynamic, research-oriented team to help unlock maximum hardware performance for emerging AI workloads. You will be a crucial member of a highly technical group exploring uncharted territories in model optimization, custom kernel development, and cluster-scale AI systems design. If you are passionate about the fundamentals of deep learning and thrive on squeezing every ounce of performance out of advanced computing systems, from a single GPU to supercomputer clusters, we want you on our team!

What you will be doing:
- Explore, research, and prototype novel systems optimizations for advanced deep learning models at the intersection of high-level DL frameworks and low-level CUDA through modeling, simulation, and silicon prototyping.
- Architect and optimize distributed computing systems that scale seamlessly from a single node to massive, cluster-scale supercomputing environments.
- Design, implement, and optimize custom high-performance CUDA kernels tailored to emerging neural network architectures and workloads.
- Analyze complex hardware-software interactions to identify and resolve performance bottlenecks in both training and inference pipelines.
- Collaborate closely with AI researchers, HW and SW architects, kernel and compiler authors, and CUDA driver experts to co-design systems and algorithms that improve accelerator compute utilization, memory bandwidth, cross-node network communication efficiency, and programmability.
- Develop exploratory tools and runtime systems to profile and accelerate new paradigms in deep learning.
- Write clean, effective, and maintainable code, ensuring exploratory prototypes can smoothly transition into open-source releases, upstream framework integrations, internal tools, or closed-source commercial products.

What we need to see:
- BS, MS, or PhD degree in Computer Science, Computer Engineering, Electrical Engineering, or related field (or equivalent experience).
- 8+ years of relevant industry experience or equivalent academic experience after degree achievement.
- Strong proficiency in C++ and Python programming.
- Solid background in the fundamentals of Deep Learning with a focus on transformers.
- Strong understanding of distributed computing principles, multi-node scaling, and the unique performance challenges of cluster-scale execution.
- Proven experience in systems programming, computer architecture, and low-level systems performance optimization.
- Familiarity with deep learning accelerator architectures such as the GPU and hands-on experience with CUDA programming and kernel optimization.
- A strong analytical approach with experience using profiling tools to deeply understand software performance on hardware.
- Experience profiling and optimizing innovative vision models, generative AI architectures, or diffusion models.
- Background in deep learning compilers, both graph-level and codegen (e.g., Triton, XLA, torch.compile).

Ways to stand out from the crowd:
- Deep expertise in the performance internals and execution graphs of major deep learning autograd, training, and inference frameworks (e.g., PyTorch, JAX, TensorRT, vLLM, SGLang, NeMo, Megatron, MaxText, etc.).
- Hands-on experience with CUDA, communication libraries (e.g., NCCL, MPI, UCX), and distributed machine learning techniques (e.g., pipeline parallelism, tensor parallelism).
- Knowledge of numerical methods, low-precision arithmetic (e.g., NVFP4, MXFP4, FP8, INT8), and their implications on deep learning model accuracy and performance.
- Familiarity with systems requirements for Reinforcement Learning (RL) or highly parallel simulation environments and/or research background in machine learning systems or adjacent fields.
- Experience with machine learning, especially agentic systems, applied to systems problems.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until July 1, 2026. This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA pioneered accelerated computing. Today, our AI infrastructure powers global intelligence, transforming every industry.


