Senior Debug & Failure Analysis Engineer - Datacenter GPUs
NVIDIA
NVIDIA is seeking a Senior System Debug Engineer to join its datacenter product engineering team in Santa Clara, California. The role involves driving failure analysis and debugging efforts during the New Product Introduction phase while collaborating with industry experts. The ideal candidate will have over 12 years of experience and a degree in Electrical Engineering. This position offers a competitive salary, equity, and a comprehensive benefits package. The successful candidate will perform rigorous failure analysis and engage with internal teams to ensure product quality and timely delivery. This job presents a unique opportunity to work with innovative technology that shapes the future of datacenters.
$200k - $322k
Join NVIDIA's datacenter product engineering team in our Operations organization and be at the forefront of technological advancement! As a Senior System Debug Engineer, you will drive failure analysis and debug efforts during our New Product Introduction (NPI) phase. You...SeniorWork experience placementOverseas- An established industry player is seeking a Sr. Failure Analysis Engineer to join their team. In this role, you will be responsible for driving... ...dynamic work environment where your expertise in hardware debugging, Linux system administration, and effective communication...Senior
$152k - $241.5k
...and Python), analytical, and debugging Good understanding of Deep Learning... ...Experience with NVIDIA GPUs, CUDA Programming, and Networking... ...stand out Proven SW engineering experience in deploying... ...in large AI job performance analysis for training/inference workload...Senior$130k - $200k
...connections among thousands of GPUs and memory units. The... ...and test systems engineer to design, build, and... ...automation and analysis frameworks, and closing... ...limitations and failure modes Debug complex issues across... ...Familiarity with telecom, datacenter optics, or silicon...SeniorContract work$200k
...AI. As AI moves beyond the datacenter into robots, autonomous mobile... ...exceptional architects and engineers to rethink how AI, sensing,... ...bug investigation, root-cause analysis, and debug across hardware and software... .... Relevant Backgrounds CPUs GPUs AI accelerators Networking...SeniorFlexible hours- ...your career. THE ROLE We are seeking a highly skilled Senior Hardware Validation & Failure Analysis Engineer to lead hardware validation, system bring-up,... ...candidate combines deep technical expertise in hardware debugging, root cause analysis, and system‑level validation...SeniorContract work
$46.3 per hour
...national staffing firm and are currently seeking a Sr. Failure Analysis Engineer for a prominent client of ours. This position is located... ...the test environment. Demonstrated ability at hardware debugging to component level and fault isolation is mandatory....SeniorMonday to FridayShift workDay shift$200k - $351k
Senior Principal, Design Engineering, Power Design... ...specification, design, and debug of complex power delivery systems for datacenter products. Skills &... ...relevant simulations, analysis, test vehicles, and... ...CPU, GPU components Failure mode analysis Good understanding...SeniorLocal area$132k - $207k
Senior System Power Validation and Applications Engineer... ...Engineer in the Datacenter System Engineering... ...development, and debug power system issues... ...feasibility analysis, present options... ...reliability, and failure analysis Experience...SeniorFull time$200k - $400k
...for a deeply technical engineer to co-design and... ...engineering, distributed debugging, and communication-runtime... ...across thousands of GPUs • Architect fault-tolerant... ...real-world cluster failures Core Technical... ...concepts • Congestion analysis and routing optimization...SeniorVisa sponsorship$116k - $184k
NVIDIA Gruppe in Santa Clara is seeking a Product Quality Engineer to lead failure analysis for system products. The ideal candidate will have a strong analytical skillset and experience in problem-solving, focusing on quality improvement initiatives. Your responsibilities...Senior$116k - $184k
...hear from you! We are looking for a Product Quality Engineer to join our team leading all aspects of failure analysis for NVIDIA’s system product segment throughout... ...on experience in PCB level hardware verification/debug and signal integrity measurement. Experience with...SeniorWork experience placement- ...Overview Product Quality Engineer for the Systems Product Quality team,... ...serving as the system‑level power debug domain expert for customer returns and field failures on NVIDIA data center systems, compute... ...Lead system‑level power failure analysis for customer returns and field...Senior
- ...connections among thousands of GPUs and memory units. The... ...Overview We are looking for a senior individual contributor with... ...Applied Physics, Electrical Engineering, or related field 7 - 15+ years... ...Hands-on experience with failure analysis and materials characterization...Senior
$168k - $258.75k
The Senior Datacenter Systems Modeling Engineer leads cross‑functional execution and modeling for data‑center power‑delivery programs. This role owns... ...and optimize the design; drive fast and effective failure analysis with solid root cause and correlation actions. Own...Senior$132k - $207k
...looking to hire a System Test Engineer who will work in the test... ...manufacturing test solutions for Datacenter products. Through HW and SW... ...in the early design stages. Debug complex hardware and software... ...scripting for automation and data analysis. Strong problem‑solving...SeniorLocal areaOverseas$168k - $258.75k
Senior Hardware Validation Engineer - Datacenter Systems Engineering Senior Hardware Validation Engineer will lead validation activities... ...into NVIDIA’s datacenter products. Debug, triage issues, conduct root‑cause analysis, verify fixes, define new tests, and improve...Senior- ...over 40 years experience in engineering and materials testing services... ...everything we do. Every analysis, every test, and every question... ...Laboratories is seeking a Senior Failure Analysis Engineer to lead hands... ...who enjoys deep technical debugging, operates lab tools independently...SeniorPermanent employmentFull timeCasual workLocal areaMonday to Friday
- ...looking for curious, collaborative, and motivated hardware engineers to take our GPU memory subsystem from first silicon... ...on leadership of HBM/GDDR enablement, validation, and failure analysis for NVIDIA’s datacenter and graphics products. Drive review of memory partner’...Senior
- Senior Electrical Engineer (Data Center)- [Hybrid] Houston, Texas Unfortunately... ...delivers end-to-end AI datacenter infrastructure built around... ...and design reviews covering failure modes such as loose... ...equivalent power systems analysis software Proficiency in ECAD...Senior
$184k - $287.5k
Senior Fortran Compiler Engineer... ...language parallelism for GPUs and Multicore CPUs... ...with teamwork throughout debugging, prototyping, and... ...experience with semantic analysis. Knowledge of programming... ...growth, our exclusive engineering teams are rapidly...SeniorRemote work$184k - $287.5k
...looking for a dedicated engineer for the Senior Systems Software... ...and improve multiple datacenter products concurrently... ...datacenter builds for NVIDIA GPUs, CPUs, and networking... ...and tools. Analyze, debug, and resolve critical... ...profiler to systems analysis. Linux systems...Senior$210k - $265k
...execution mode and has a world-class engineering team with decades of... ...conversion systems, including circuit analysis, simulation, control loop... ...testing, validation, and debugging power supply designs to ensure... ...for AI accelerators, GPUs, or high‑performance ASICs. Knowledge...Senior$75 - $85 per hour
...development and hardware engineering company, offering end-... ...com/careers Title: Senior Electrical Engineer... ...electrical measurements and analysis across power rails,... .... ~Execute lab debugging and troubleshooting using... .... ~ Conduct failure analysis of electrical...SeniorFlexible hours$152k - $241.5k
...based programming model for our GPUs. CUDA Tile shipped with CUDA1... ..., and other general software engineering work. What we need to see:... ...compiler optimization, performance analysis and IR design. Ability to... ...design skills, including debugging, performance analysis, and test...Senior$152k - $241.5k
...looking for versatile software engineers for our XLA team. NVIDIA is... ...the OpenXLA compiler on NVIDIA GPUs at scale. You’ll collaborate... .... Performance tuning and analysis. Code‑generation for NVIDIA GPU... ...software design skills, including debugging, performance analysis, and...Senior- ...and experienced HPC Cluster Engineer to design, deploy and operate... ...workloads, including performance analysis and optimizations. Conduct... ...accelerate researchers’ velocity, debugging and software performance at... ...: Background with NVIDIA GPUs, CUDA programming, NCCL and MLPerf...Senior
- ...Senior Systems Engineer US - Milpitas About Us Graphcore is one of the... .... Diagnose system-level failures involving thermal behavior,... ...to perform root cause analysis and propose corrective actions... ...architectures and board-level debugging. Experience analyzing...SeniorFlexible hours
$200k
...As AI moves beyond the datacenter into robots,... ...exceptional architects and engineers to rethink how AI, sensing... ...We are looking for a Senior RTL Engineer to help build... ...optimization. Strong debugging and problem-solving... ...and memory subsystems GPUs AI accelerators...SeniorFlexible hoursNight shift- ...develop software infrastructure and improve datacenter architectures designed for deep... ...have a Bachelor’s degree in Electrical Engineering or Computer Science and 8 years of experience... ...with CUDA, PyTorch, and performance analysis is essential.Senior


