Senior Technical Program Manager, DGX Cloud Software Products and Services

NVIDIA

Technical Program Manager (IC5)

NVIDIA's DGX Cloud (DGXC) powers AI for strategic research and product workloads. The company seeks an expert Technical Program Manager (IC5) to lead strategic programs emphasizing resilience, reliability, and goodput. The role works across multiple teams to drive improvements in resilience, service stability, and operational scale, and guides architectural decisions for the resilience reference architecture. The TPM leads programs spanning DGXC infrastructure, Resilience Tools, and core platform services to deliver fault-tolerant, high-availability training and inference environments at scale.

We are looking for a TPM who is analytical, technically skilled, and comfortable working with cloud infrastructure, software, operations, and environments driven by data and research. You will work closely with engineering, SRE, operations, and researchers to develop scalable resilience strategies, improve operational performance, and assist in building open, modular software components and reference stacks for DGX Cloud at scale.

What You'll Be Doing:

  • Lead cross-functional programs that improve resilience, reliability, operational scale, and fleet-wide goodput across DGX Cloud.
  • Partner across infrastructure, platform, site reliability, operational, and tenant teams to identify systemic risks, resolve cross-stack dependencies, and improve end-to-end service stability.
  • Drive the definition and adoption of resilience reference stacks, operational standards, and scalable guidelines that strengthen service readiness and recovery.
  • Partner with engineering teams and researchers to support the development and delivery of open, modular software components for resilience, facilitating reusable and extensible capabilities across the platform.
  • Build and scale resilience tooling and operational mechanisms that improve observability, failure detection and attribution, root cause analysis, recovery orchestration, and operational readiness.
  • Define, measure, and improve goodput, using data-driven insights to increase usable fleet capacity, workload efficiency, and customer outcomes at scale.
  • Establish clear metrics, dashboards, and operating cadences to track program health, reliability posture, operational maturity, and performance.

What We Need To See:

  • MS EE or CS degree, or equivalent experience.
  • 8+ years of experience in program management of large-scale software or infrastructure projects.
  • Proven track record of leading complex cross-functional programs in cloud, infrastructure, distributed systems, or platform environments.
  • Strong analytical skills with the ability to assess issues across infrastructure, software, and operational layers.
  • Excellent organizational skills and ability to use project management tools (e.g., Jira, Aha!, Confluence) and distributed version control systems (e.g., Git).
  • Solid understanding of reliability engineering, resilience development, and service performance metrics, including goodput, efficiency, and utilization.
  • Experience working alongside engineering, SRE, operations, and technical collaborators to advance projects in ambiguous, high-complexity environments.
  • Outstanding communication and presentation skills for diverse technical and non-technical audiences, along with strong problem-solving and conflict-management skills.

Ways To Stand Out From The Crowd:

  • Background in computer science, machine learning, deep learning, open-source software, GPU technology, AI infrastructure, or large-scale compute platforms.
  • Experience with large-scale AI training environments (e.g., distributed training frameworks, checkpointing, NCCL, Slurm or other schedulers).
  • Prior experience managing customer workflows that use large-scale distributed computing, and working with AI researchers or directly training and evaluating AI models.
  • Proven ability to harness AI-enabled workflows and tools to improve program management efficiency, decision-making, execution visibility, and operational efficiency.

Widely considered to be one of the technology world's most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer to you and your family.

NVIDIA
Vacancy posted 3 days ago
