Technical Program Manager - AI Cluster Validation

Advanced Micro Devices, Inc.

WHAT YOU DO AT AMD CHANGES EVERYTHING


At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity, and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE

We are seeking a Technical Program Manager to lead execution of AI cluster engineering programs with a deep focus on GPU platforms, rack-level solutions, and AI cluster validation. This role is responsible for driving end-to-end delivery, from GPU and server integration through rack bring-up, scale testing, failure analysis, and system debug closure, ensuring platform readiness for hyperscale and enterprise AI deployments.

This role operates at the intersection of hardware, firmware, networking, and scale-test execution, and requires strong technical depth combined with disciplined program execution.

THE PERSON

You are a hands-on TPM who thrives in complex, fast-moving ecosystems and can connect deep technical details to crisp program plans, executive reporting, and customer outcomes. You are comfortable driving execution in bring-up and EVT/DVT/PVT, working closely with engineers to root-cause issues, unblock debug, and make data-driven tradeoffs to keep programs moving. You bring urgency, ownership, and clarity to ambiguous problem spaces and can communicate effectively from the lab floor to executive review.

KEY RESPONSIBILITIES

Program Leadership & Execution
  • Define and drive program plans for AI infrastructure systems validation and readiness, including server integration, rack bring-up, and cluster-scale deployment readiness.
  • Create and maintain core PM artifacts: schedules, dependency maps, resource forecasts, risk/issue logs, and program dashboards/status reports.
  • Identify and drive mitigation plans for issues/risks, including cross-team escalations and corrective actions across multiple engineering areas.
  • Drive regular execution reviews with engineering teams and provide concise, data-driven updates to senior leadership.
GPU & Platform Execution
  • Own program execution for GPU-based AI platforms, spanning system bring-up, qualification, scale readiness, and deployment validation across server, rack, and cluster levels.
  • Drive alignment across GPU, CPU, firmware, BIOS/BMC, and system teams to ensure readiness for scale testing and customer workloads.
  • Track platform issues and debug dependencies; ensure risks are clearly documented, owned, and mitigated.
AI Rack / Cluster Validation
  • Own program planning and execution for multi-node and multi-rack scale testing, including test strategy, scheduling, coverage tracking, and readiness gates.
  • Lead end-to-end delivery of rack-level AI solutions, including compute trays, switch trays, cabling, power, cooling, and management infrastructure.
  • Ensure rack bring-up plans are executable, resourced, and gated with clear entry/exit criteria across EVT, DVT, and scale phases.
  • Drive coordination across lab operations, infrastructure, and engineering teams to unblock rack access, power, networking, and test readiness.
  • Partner with scale, performance, and automation teams to ensure workloads, stress tests, and regression plans are ready before hardware arrives.
Debug, Failure Analysis & Risk Management
  • Act as the execution lead for platform debug, coordinating across engineering teams to ensure fast triage, root-cause analysis, and resolution of system-level issues.
  • Track high-impact failures (GPU, HSIO, FW, rack, network) through debug forums, ensuring clear ownership and closure plans.
  • Balance debug depth vs. program timelines, escalating tradeoffs when needed and ensuring leadership has a clear view of risk and impact.
REQUIRED QUALIFICATIONS
  • Experience leading complex hardware or AI infrastructure programs with ownership across bring-up, validation, and deployment phases.
  • Strong technical understanding of GPU-based AI systems, rack architectures, and datacenter infrastructure.
  • Proven ability to manage ambiguity, drive debug execution, and lead cross-functional teams without direct authority.
  • Strong written and verbal communication skills, including executive-level status reporting.
  • Proficiency with program management and execution tools (Jira, Confluence, dashboards, Excel/PowerPoint).
PREFERRED QUALIFICATIONS
  • Hands-on experience with GPU cluster scale testing, system stress, or performance validation.
  • Familiarity with rack-level bring-up, power/cooling constraints, networking, and failure modes at scale.
  • Experience working through hardware/firmware debug cycles in pre-production or customer-facing environments.
ACADEMIC CREDENTIALS
  • Bachelor's or master's degree in systems, EE, CS, or a related engineering discipline.
  • PMP, Scrum Master, or equivalent program management training.

LOCATION

Austin, TX

This role is not eligible for visa sponsorship.

Benefits offered are described in "AMD benefits at a glance."

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD's "Responsible AI Policy" is available here.

This posting is for an existing vacancy.
Vacancy posted 4 days ago