Senior Software Engineer, DGX Cloud Production Engineering

$184k - $287.5k

NVIDIA

NVIDIA DGX Cloud is building and operating large-scale GPU infrastructure for AI research and production workloads. We are looking for Senior Software Engineers to help build the automation, tooling, and operational systems that make GPU clusters reliable, scalable, and safe to run. This role is part of a production engineering team focused on Kubernetes-based infrastructure, GPU cluster operations, reliability, automation, GitOps, and Day 2 operability across DGX Cloud environments.

What you'll be doing:
  • Build and operate automation for large-scale GPU clusters across NVIDIA Cloud Partners (NCP) and on-prem environments.
  • Develop tools and services for provisioning, validation, upgrades, monitoring, repair, and cluster lifecycle operations.
  • Improve Day 0 / Day 1 / Day 2 workflows for cluster bring-up, handoff, and production operations.
  • Reduce manual production touches through APIs, GitOps, automation, and agent-assisted workflows.
  • Participate in on-call, incident response, debugging, and durable follow-up work.
  • Partner with platform, storage, networking, security, and workload teams to make infrastructure production-ready.

What we need to see:
  • 8+ years of experience building or operating production infrastructure.
  • Strong programming skills in Python, Go, or similar.
  • Experience with Linux, Kubernetes, containers, cloud infrastructure, or infrastructure automation.
  • Ability to troubleshoot distributed systems in production.
  • Clear communication and ability to work across teams.
  • BS/MS in Computer Science or equivalent experience.

Ways to stand out from the crowd:
  • Experience with GPU infrastructure, Kubernetes operators, GitOps, Terraform, ArgoCD, or fleet automation.
  • Experience with SLOs, on-call, incident response, observability, and reliability practices.
  • Exposure to BMaaS, VMaaS, managed Kubernetes, or multi-cloud infrastructure.

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. We have some of the most forward-thinking and hard-working people on the planet working for us. If you're creative, hard-working and self-motivated, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until May 22, 2026.

This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
Vacancy posted 1 day ago