Technical Program Manager - AI Cluster Validation
Advanced Micro Devices, Inc.
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
THE ROLE
We are seeking a Technical Program Manager to lead execution of AI cluster engineering programs with deep focus on GPU platforms, rack-level solutions, and AI cluster validation. This role is responsible for driving end-to-end delivery from GPU + server integration through rack bring-up, scale testing, failure analysis, and system debug closure, ensuring platform readiness for hyperscale and enterprise AI deployments. This role operates at the intersection of hardware, firmware, networking, and scale-test execution, and requires strong technical depth combined with disciplined program execution.
THE PERSON
You are a hands-on TPM who thrives in complex, fast-moving ecosystems and can connect deep technical details to crisp program plans, executive reporting, and customer outcomes. You are comfortable driving execution in bring-up and EVT/DVT/PVT, working closely with engineers to root-cause issues, unblock debug, and make data-driven tradeoffs to keep programs moving. You bring urgency, ownership, and clarity to ambiguous problem spaces and can communicate effectively from lab floor to executive review.
KEY RESPONSIBILITIES
Program Leadership & Execution
- Define, plan, and drive program plans for AI infrastructure systems validation and readiness, including server integration, rack bring-up, and cluster-scale deployment readiness.
- Create and maintain core PM artifacts: schedules, dependency maps, resource forecasts, risk/issue logs, and program dashboards/status reports.
- Identify and drive mitigation plans for issues/risks, including cross-team escalations and corrective actions across multiple engineering areas.
- Drive regular execution reviews with engineering teams and provide concise, data-driven updates to senior leadership.
- Own program execution for GPU-based AI platforms, spanning system bring-up, qualification, scale readiness, and deployment validation across server, rack, and cluster levels.
- Drive alignment across GPU, CPU, firmware, BIOS/BMC, and system teams to ensure readiness for scale testing and customer workloads.
- Track platform issues and debug dependencies; ensure risks are clearly documented, owned, and mitigated.
- Own program planning and execution for multi-node and multi-rack scale testing, including test strategy, scheduling, coverage tracking, and readiness gates.
- Lead end-to-end delivery of rack-level AI solutions, including compute trays, switch trays, cabling, power, cooling, and management infrastructure.
- Ensure rack bring-up plans are executable, resourced, and gated with clear entry/exit criteria across EVT, DVT, and scale phases.
- Drive coordination across lab operations, infrastructure, and engineering teams to unblock rack access, power, networking, and test readiness.
- Partner with scale, performance, and automation teams to ensure workloads, stress tests, and regression plans are ready before hardware arrives.
- Act as the execution lead for platform debug, coordinating across engineering teams to ensure fast triage, root-cause analysis, and resolution of system-level issues.
- Track high-impact failures (GPU, HSIO, FW, rack, network) through debug forums, ensuring clear ownership and closure plans.
- Balance debug depth vs. program timelines, escalating tradeoffs when needed and ensuring leadership has a clear view of risk and impact.
- Experience leading complex hardware or AI infrastructure programs with ownership across bring-up, validation, and deployment phases.
- Strong technical understanding of GPU-based AI systems, rack architectures, and datacenter infrastructure.
- Proven ability to manage ambiguity, drive debug execution, and lead cross-functional teams without direct authority.
- Strong written and verbal communication skills, including executive-level status reporting.
- Proficiency with program management and execution tools (Jira, Confluence, dashboards, Excel/PowerPoint).
- Hands-on experience with GPU cluster scale testing, system stress, or performance validation.
- Familiarity with rack-level bring-up, power/cooling constraints, networking, and failure modes at scale.
- Experience working through hardware/firmware debug cycles in pre-production or customer-facing environments.
- Bachelor's or master's degree in systems, EE, CS, or related engineering discipline.
- PMP, Scrum Master, or equivalent program management training.
LOCATION
Austin, TX
This role is not eligible for visa sponsorship.
Benefits offered are described: AMD benefits at a glance.
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.
AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD's "Responsible AI Policy" is available here.
This posting is for an existing vacancy.
Vacancy posted 4 days ago

