Senior Software Engineer, DGX Cloud Production Engineering

$184k - $287.5k

NVIDIA

NVIDIA DGX Cloud is building and operating large-scale GPU infrastructure for AI research and production workloads. We are looking for Senior Software Engineers to help build the automation, tooling, and operational systems that make GPU clusters reliable, scalable, and safe to run. This role is part of a production engineering team focused on Kubernetes-based infrastructure, GPU cluster operations, reliability, automation, GitOps, and Day 2 operability across DGX Cloud environments.

What you'll be doing:
  • Build and operate automation for large-scale GPU clusters across NVIDIA Cloud Partners (NCP) and on-prem environments.
  • Develop tools and services for provisioning, validation, upgrades, monitoring, repair, and cluster lifecycle operations.
  • Improve Day 0 / Day 1 / Day 2 workflows for cluster bringup, handoff, and production operations.
  • Reduce manual production touches through APIs, GitOps, automation, and agent-assisted workflows.
  • Participate in on-call, incident response, debugging, and durable follow-up work.
  • Partner with platform, storage, networking, security, and workload teams to make infrastructure production-ready.

What we need to see:
  • 8+ years of experience building or operating production infrastructure.
  • Strong programming skills in Python, Go, or similar.
  • Experience with Linux, Kubernetes, containers, cloud infrastructure, or infrastructure automation.
  • Ability to troubleshoot distributed systems in production.
  • Clear communication and ability to work across teams.
  • BS/MS in Computer Science or equivalent experience.

Ways to stand out from the crowd:
  • Experience with GPU infrastructure, Kubernetes operators, GitOps, Terraform, ArgoCD, or fleet automation.
  • Experience with SLOs, on-call, incident response, observability, and reliability practices.
  • Exposure to BMaaS, VMaaS, managed Kubernetes, or multi-cloud infrastructure.

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. We have some of the most forward-thinking and hard-working people on the planet working for us. If you're creative, hard-working and self-motivated, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 8, 2026.

This posting is for an existing vacancy.


NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
Vacancy posted 3 days ago
