
Failure Analysis Engineering Manager, GPU ASIC and PCBA Debug

Advanced Micro Devices, Inc.

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity, and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE

The Quality Engineering team is looking for an experienced GPU ASIC and PCBA Debug and Failure Analysis Engineering Manager to lead and develop a team of FA engineers. This role is intended for a proven people manager with prior experience building, mentoring, and guiding high-performing engineering teams, while also serving as a strong technical lead in GPU ASIC and board-level (PCBA) failure analysis. The individual will oversee customer and factory failure investigations for GPU accelerators, help drive failure reproduction and isolation, and work closely with cross-functional teams including design, validation, FW, and manufacturing to accelerate root cause analysis and corrective actions. Your contributions will directly impact team effectiveness, product quality, reliability, and customer satisfaction.

THE PERSON

The ideal candidate is a strong people leader and technical expert who leads by example and is passionate about building, teaching, and mentoring a growing team of high-performing FA engineers. They bring prior experience managing, hiring, and developing engineers, creating an environment of accountability, collaboration, and continuous learning, while remaining hands-on enough to guide complex debug and failure analysis efforts in a fast-paced, time-to-market environment. This person is a clear communicator and a trusted technical leader who can elevate team capability, help others grow in their careers, and drive strong execution. They combine deep analytical problem-solving skills with a practical, hands-on approach, and continuously look for ways to improve team effectiveness, technical depth, and overall quality outcomes.

KEY RESPONSIBILITIES

  • Provide technical leadership for triage and debug of complex GPU and PCBA failures across power, ASIC, firmware, and thermals, guiding the FA team to root cause.
  • Lead failure reproduction and triage by defining debug plans, directing investigations, and guiding experiments and escalation paths for complex issues.
  • Drive debug automation, diagnostic tools, and data analysis methods that improve triage efficiency and consistency across failure domains.
  • Lead cross-functional triage with manufacturing partners and AMD teams to align on failure hypotheses, reproduction, and root cause.
  • Guide board-level debug using schematics, layouts, and design documentation to direct analysis and mentor engineers through the process.
  • Ensure clear documentation of failure analysis results, root cause findings, and corrective actions for customer and internal use.
  • Present technical findings, triage updates, risks, and recovery plans to stakeholders and senior leadership.
  • Drive continuous improvement of FA methods, triage processes, and best practices across power, ASIC, firmware, and thermal debug.
  • Manage and develop a team of FA engineers by setting priorities, providing technical guidance, and coaching through complex investigations.

PREFERRED EXPERIENCE

  • Experience leading and developing engineering teams, with a strong track record of hiring, coaching, mentoring, and growing FA engineers.
  • Deep expertise in GPU ASIC debug, validation, and functional or stress test development.
  • Strong background in PCBA diagnostics, failure analysis, and board-level debug from NPI through production.
  • Experience leading triage across power, ASIC, firmware, and thermal failure domains.
  • Strong hands-on lab experience with oscilloscopes, logic analyzers, and custom debug tools.
  • Solid understanding of firmware, drivers, and hardware interactions in complex system debug.
  • Extensive experience in hardware verification, system integration, and failure reproduction.
  • Proficient in Python, shell scripting, and working across Windows and Linux environments.
  • Strong leadership, communication, and presentation skills, with the ability to teach, mentor, and lead by example.
  • Able to read schematics, interpret datasheets, identify components, and support board-level debug and rework.
  • Knowledge of high-speed digital design, HBM or GDDR memory, PCIe, and GPU data center systems is a plus.

ACADEMIC CREDENTIALS

  • Bachelor’s degree in Electrical Engineering, Computer Engineering, or a related field.
  • 3+ years of management experience.

LOCATION

Secaucus, NJ

This role is not eligible for visa sponsorship.

BENEFITS & COMPENSATION (SUMMARY)

Benefits offered are described: AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD’s “Responsible AI Policy” is available here.

This posting is for an existing vacancy.

Vacancy posted 3 days ago
