Senior Software Engineer, CUDA Deep Learning Systems
$184k - $287.5k
NVIDIA Corporation
We are looking for an experienced and highly motivated software professional to work on pioneering initiatives and projects at the intersection of CUDA and Deep Learning Systems. As the complexity and scale of artificial intelligence continue to grow, the intersection of advanced deep learning architectures, massive-scale distributed computing, and low-level hardware optimization has never been more critical. Our team is dedicated to exploring and prototyping next-generation ideas that bridge the gap between deep learning algorithms and CUDA, pushing the boundaries of what is possible on modern accelerator architectures.

Join our dynamic, research-oriented team to help unlock maximum hardware performance for emerging AI workloads. You will be a crucial member of a highly technical group exploring uncharted territories in model optimization, custom kernel development, and cluster-scale AI systems design. If you are passionate about the fundamentals of deep learning and thrive on squeezing every ounce of performance out of advanced computing systems from a single GPU to supercomputer clusters, we want you on our team!

What you will be doing:
- Explore, research, and prototype novel systems optimizations for advanced deep learning models at the intersection of high-level DL frameworks and low-level CUDA through modeling, simulation, and silicon prototyping.
- Architect and optimize distributed computing systems that scale seamlessly from a single node to massive, cluster-scale supercomputing environments.
- Design, implement, and optimize custom high-performance CUDA kernels tailored to emerging neural network architectures and workloads.
- Analyze complex hardware-software interactions to identify and resolve performance bottlenecks in both training and inference pipelines.
- Collaborate closely with AI researchers, HW and SW architects, kernel and compiler authors and CUDA driver experts to co-design systems and algorithms that improve accelerator compute utilization, memory bandwidth, cross-node network communication efficiency and programmability.
- Develop exploratory tools and runtime systems to profile and accelerate new paradigms in deep learning.
- Write clean, effective, and maintainable code, ensuring exploratory prototypes can smoothly transition into open-source releases, upstream framework integrations, internal tools, or closed-source commercial products.

What we need to see:
- BS, MS, or PhD degree in Computer Science, Computer Engineering, Electrical Engineering, or related field (or equivalent experience).
- 8+ years of relevant industry experience or equivalent academic experience after degree achievement.
- Strong proficiency in C++ and Python programming.
- Solid background in the fundamentals of Deep Learning with a focus on transformers.
- Strong understanding of distributed computing principles, multi-node scaling, and the unique performance challenges of cluster-scale execution.
- Proven experience in systems programming, computer architecture, and low-level systems performance optimization.
- Familiarity with deep learning accelerator architectures such as the GPU and hands-on experience with CUDA programming and kernel optimization.
- A strong analytical approach with experience using profiling tools to deeply understand software performance on hardware.
- Experience profiling and optimizing innovative vision models, generative AI architectures, or diffusion models.
- Background in deep learning compilers, both graph-level and codegen (e.g., Triton, XLA, torch compile).

Ways to stand out from the crowd:
- Deep expertise in the performance internals and execution graphs of major deep learning autograd, training and inference frameworks (e.g., PyTorch, JAX, TensorRT, vLLM, sgLang, Nemo, Megatron, MaxText, etc.).
- Hands-on experience with CUDA, communication libraries (e.g., NCCL, MPI, UCX) and distributed machine learning techniques (e.g., pipeline parallelism, tensor parallelism).
- Knowledge of numerical methods, low-precision arithmetic (e.g., NVFP4, MXFP4, FP8, INT8), and their implications on deep learning model accuracy and performance.
- Familiarity with systems requirements for Reinforcement Learning (RL) or highly parallel simulation environments and/or research background in machine learning systems or adjacent fields.
- Experience with machine learning, especially agentic systems, applied to systems problems.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until July 1, 2026. This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA pioneered accelerated computing. Today, our AI infrastructure powers global intelligence, transforming every industry.


