Technical Program Manager - AI Cluster Validation

Advanced Micro Devices, Inc.

WHAT YOU DO AT AMD CHANGES EVERYTHING


At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity, and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE

We are seeking a Technical Program Manager to lead execution of AI cluster engineering programs with a deep focus on GPU platforms, rack-level solutions, and AI cluster validation. This role is responsible for driving end-to-end delivery from GPU and server integration through rack bring-up, scale testing, failure analysis, and system debug closure, ensuring platform readiness for hyperscale and enterprise AI deployments.

This role operates at the intersection of hardware, firmware, networking, and scale-test execution, and requires strong technical depth combined with disciplined program execution.

THE PERSON

You are a hands-on TPM who thrives in complex, fast-moving ecosystems and can connect deep technical details to crisp program plans, executive reporting, and customer outcomes. You are comfortable driving execution in bring-up and EVT/DVT/PVT, working closely with engineers to root-cause issues, unblock debug, and make data-driven tradeoffs to keep programs moving. You bring urgency, ownership, and clarity to ambiguous problem spaces and can communicate effectively from lab floor to executive review.

KEY RESPONSIBILITIES

Program Leadership & Execution
  • Define and drive program plans for AI infrastructure systems validation and readiness, including server integration, rack bring-up, and cluster-scale deployment readiness.
  • Create and maintain core PM artifacts: schedules, dependency maps, resource forecasts, risk/issue logs, and program dashboards/status reports.
  • Identify and drive mitigation plans for issues/risks, including cross-team escalations and corrective actions across multiple engineering areas.
  • Drive regular execution reviews with engineering teams and provide concise, data-driven updates to senior leadership.
GPU & Platform Execution
  • Own program execution for GPU-based AI platforms, spanning system bring-up, qualification, scale readiness, and deployment validation across server, rack, and cluster levels.
  • Drive alignment across GPU, CPU, firmware, BIOS/BMC, and system teams to ensure readiness for scale testing and customer workloads.
  • Track platform issues and debug dependencies; ensure risks are clearly documented, owned, and mitigated.
AI Rack / Cluster Validation
  • Own program planning and execution for multi-node and multi-rack scale testing, including test strategy, scheduling, coverage tracking, and readiness gates.
  • Lead end-to-end delivery of rack-level AI solutions, including compute trays, switch trays, cabling, power, cooling, and management infrastructure.
  • Ensure rack bring-up plans are executable, resourced, and gated with clear entry/exit criteria across EVT, DVT, and scale phases.
  • Drive coordination across lab operations, infrastructure, and engineering teams to unblock rack access, power, networking, and test readiness.
  • Partner with scale, performance, and automation teams to ensure workloads, stress tests, and regression plans are ready before hardware arrives.
Debug, Failure Analysis & Risk Management
  • Act as the execution lead for platform debug, coordinating across engineering teams to ensure fast triage, root-cause analysis, and resolution of system-level issues.
  • Track high-impact failures (GPU, HSIO, FW, rack, network) through debug forums, ensuring clear ownership and closure plans.
  • Balance debug depth vs. program timelines, escalating tradeoffs when needed and ensuring leadership has a clear view of risk and impact.
REQUIRED QUALIFICATIONS
  • Experience leading complex hardware or AI infrastructure programs with ownership across bring-up, validation, and deployment phases.
  • Strong technical understanding of GPU-based AI systems, rack architectures, and datacenter infrastructure.
  • Proven ability to manage ambiguity, drive debug execution, and lead cross-functional teams without direct authority.
  • Strong written and verbal communication skills, including executive-level status reporting.
  • Proficiency with program management and execution tools (Jira, Confluence, dashboards, Excel/PowerPoint).
PREFERRED QUALIFICATIONS
  • Hands-on experience with GPU cluster scale testing, system stress, or performance validation.
  • Familiarity with rack-level bring-up, power/cooling constraints, networking, and failure modes at scale.
  • Experience working through hardware/firmware debug cycles in pre-production or customer-facing environments.
ACADEMIC CREDENTIALS
  • Bachelor's or master's degree in systems, EE, CS, or a related engineering discipline.
  • PMP, Scrum Master, or equivalent program management training.

LOCATION

Austin, TX

This role is not eligible for visa sponsorship.

Benefits offered are described: AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD's "Responsible AI Policy" is available here.

This posting is for an existing vacancy.
Vacancy posted 4 days ago