Principal System Debug Engineer

Graphcore

Austin, Texas, United States; US - Milpitas

Position Overview

We are seeking a senior technical leader (Principal Engineer level and above) to lead the bring-up, enablement, and hardware debug of server compute systems and rack-level platforms based on Arm® server architecture.

The successful candidate will be a key member of the System Validation organization, responsible for driving system-level debug activities and facilitating rapid resolution of complex hardware, firmware, and software issues. This role requires close collaboration with engineering teams across the organization to identify root causes, implement corrective actions, and ensure successful program execution.

The ideal candidate will be deeply involved in challenging system debug efforts while developing and executing scalable debug strategies that maximize throughput and ensure Product of Record (POR) quality. In addition, this individual will establish and drive debug methodologies, improve processes, and help create a culture of technical excellence across the organization.

We are looking for a disciplined, dynamic, and highly motivated leader who can thrive in a global environment while fostering strong cross-functional collaboration.

As a Lead System Debug Engineer within Server and Rack Validation, you will drive balanced, scalable, and automated debug solutions that optimize engineering efficiency and product quality. This highly visible role provides the opportunity to innovate and improve debugging capabilities while delivering industry-leading server technologies to market.

Your technical leadership, validation expertise, and problem-solving skills will play a critical role in product development, issue root cause analysis, and resolution. Success in this role requires close collaboration with System Validation, System Architecture, Silicon Engineering, Rack Firmware, and other cross-functional teams.

Primary Responsibilities
  • Develop and drive a Debug Center of Excellence, including scalable debug and triage methodologies, processes, and playbooks for server blade and rack-level issues spanning hardware, firmware, and software integration.
  • Debug issues discovered during server rack bring-up, post-silicon validation, and production phases.
  • Lead complex debug efforts involving silicon, server systems, firmware, and software to determine root causes and drive effective resolutions.
  • Ensure issues are resolved with high quality and within program timelines.
  • Manage and track technical issues, risks, and priorities to remove blockers and achieve key program milestones.
  • Develop and publish debug program metrics and indicators to identify roadblocks and improve overall debug efficiency.
  • Communicate program status, risks, and opportunities to customers, stakeholders, and executive leadership.
  • Drive technical innovation across triage and debug workflows through tool development, scripting, methodology enhancements, and cross-functional engineering initiatives.
  • Mentor engineers and promote best practices in system validation and debug methodologies.

Required Qualifications
  • Strong analytical and problem-solving skills with exceptional attention to detail.
  • Extensive experience in validation and debug roles involving operating systems, firmware, silicon, and hardware issues.
  • Deep understanding of industry-standard server interconnects and software stacks, including PCIe and CXL.
  • Strong knowledge of Arm® CPU or x86 architectures, SoC design, memory subsystems, RAS (Reliability, Availability, and Serviceability), and power management.
  • Extensive experience with system architecture, technical debugging, and validation strategies.
  • Strong understanding of platform-level and system-level debug methodologies, including operating system, device driver, and BIOS interactions.
  • Excellent communication, collaboration, and cross-functional leadership skills.
  • Highly organized and detail-oriented, with the ability to manage multiple priorities and deliver results under tight deadlines.
  • Experience leading technical programs and coordinating cross-functional engineering efforts.
  • Thorough understanding of data center technologies and associated software stacks.
  • Self-motivated with the ability to independently drive tasks from problem identification through resolution.

Preferred Qualifications
  • Master's degree or Ph.D. in Electrical Engineering, Computer Engineering, Computer Science, or a related technical field.
  • Experience with large-scale server platforms, rack-level systems, and hyperscale data center environments.
  • Expertise in automation, scripting, and debug tool development.
  • Experience establishing and scaling debug processes across multiple product generations and engineering organizations.
Vacancy posted 4 hours ago