Senior Technical Program Manager, DGX Cloud Software Products and Services
NVIDIA
Technical Program Manager (IC5)
NVIDIA's DGX Cloud (DGXC) powers AI for strategic research and product workloads. The company seeks an expert Technical Program Manager (IC5) to lead strategic programs emphasizing resilience, reliability, and goodput. The role requires collaboration across multiple teams to drive improvements in resilience, service stability, and operational scale, and it guides architectural decisions for the resilience reference architecture. The TPM leads programs spanning DGXC infrastructure, Resilience Tools, and core platform services to deliver fault-tolerant, high-availability training and inference environments at scale.
We are looking for a TPM who is analytical, technically skilled, and comfortable working with cloud infrastructure, software, operations, and environments driven by data and research. You will work closely with engineering, SRE, operations, and researchers to develop scalable resilience strategies, improve operational performance, and assist in building open, modular software components and reference stacks for DGX Cloud at scale.
What You'll Be Doing:
- Lead cross-functional programs that improve resilience, reliability, operational scale, and fleet-wide goodput across DGX Cloud.
- Partner across infrastructure, platform, site reliability, operational, and tenant teams to identify systemic risks, resolve cross-stack dependencies, and improve end-to-end service stability.
- Drive the definition and adoption of resilience reference stacks, operational standards, and scalable guidelines that strengthen service readiness and recovery.
- Partner with engineering teams and researchers to support the development and delivery of open, modular software components for resilience, facilitating reusable and extensible capabilities across the platform.
- Build and scale resilience tooling and operational mechanisms that improve observability, failure detection and attribution, root cause analysis, recovery orchestration, and operational readiness.
- Define, measure, and improve goodput, using data-driven insights to increase usable fleet capacity, workload efficiency, and customer outcomes at scale.
- Establish clear metrics, dashboards, and operating cadences to track program health, reliability posture, operational maturity, and performance.
What We Need To See:
- MS EE or CS degree, or equivalent experience.
- 8+ years of experience in program management of large-scale software or infrastructure projects.
- Proven track record of leading complex cross-functional programs in cloud, infrastructure, distributed systems, or platform environments.
- Strong analytical skills with the ability to assess issues across infrastructure, software, and operational layers.
- Excellent organizational skills and the ability to use project management tools (e.g., Jira, Aha!, Confluence) and distributed version control systems (e.g., Git).
- Solid understanding of reliability engineering, resilience development, and service performance metrics, including goodput, efficiency, and utilization.
- Experience working alongside engineering, SRE, operations, and technical collaborators to advance projects in ambiguous, high-complexity environments.
- Outstanding communication and presentation skills for diverse technical and non-technical audiences, along with strong problem-solving and conflict-management skills.
Ways To Stand Out From The Crowd:
- Background in computer science, machine learning, deep learning, open-source software, GPU technology, AI infrastructure, or large-scale compute platforms.
- Experience with large-scale AI training environments (e.g., distributed training frameworks, checkpointing, NCCL, Slurm or other schedulers).
- Prior experience managing customer workflows that use large-scale distributed computing, and working with AI researchers or directly training and evaluating AI models.
- Proven ability to harness AI-enabled workflows and tools to improve program management efficiency, decision-making, execution visibility, and operational efficiency.
Widely considered to be one of the technology world's most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer to you and your family.
