Senior Software Engineer, DGX Cloud Production Engineering
$184k - $287.5k
NVIDIA
NVIDIA DGX Cloud is building and operating large-scale GPU infrastructure for AI research and production workloads. We are looking for Senior Software Engineers to help build the automation, tooling, and operational systems that make GPU clusters reliable, scalable, and safe to run. This role is part of a production engineering team focused on Kubernetes-based infrastructure, GPU cluster operations, reliability, automation, GitOps, and Day 2 operability across DGX Cloud environments.
What you'll be doing:
- Build and operate automation for large-scale GPU clusters across NVIDIA Cloud Partners (NCP) and on-prem environments.
- Develop tools and services for provisioning, validation, upgrades, monitoring, repair, and cluster lifecycle operations.
- Improve Day 0 / Day 1 / Day 2 workflows for cluster bringup, handoff, and production operations.
- Reduce manual production touches through APIs, GitOps, automation, and agent-assisted workflows.
- Participate in on-call, incident response, debugging, and durable follow-up work.
- Partner with platform, storage, networking, security, and workload teams to make infrastructure production-ready.
What we need to see:
- 8+ years of experience building or operating production infrastructure.
- Strong programming skills in Python, Go, or similar.
- Experience with Linux, Kubernetes, containers, cloud infrastructure, or infrastructure automation.
- Ability to troubleshoot distributed systems in production.
- Clear communication and ability to work across teams.
- BS/MS in Computer Science or equivalent experience.
Ways to stand out from the crowd:
- Experience with GPU infrastructure, Kubernetes operators, GitOps, Terraform, ArgoCD, or fleet automation.
- Experience with SLOs, on-call, incident response, observability, and reliability practices.
- Exposure to BMaaS, VMaaS, managed Kubernetes, or multi-cloud infrastructure.
NVIDIA uses AI tools in its recruiting processes. NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

