Distinguished Resiliency and Safety Architect, GPU Diagnostics

$320k

NVIDIA

We are now looking for a Distinguished Resiliency and Safety Architect, GPU Diagnostics!

Today, NVIDIA is tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, encouraging environment where everyone is inspired to do their best work. Come join the team and see how we can make a lasting impact on the world.

We are now seeking a Resiliency and Safety Architect to support the development of GPU (graphics processing unit) diagnostics for Resiliency in the Datacenter and Functional Safety in Autonomous Vehicles and Robots. In this role, you will be a key member of a team of innovators, challenging the status quo and pushing beyond boundaries. You will have the opportunity to impact the industry's leading GPUs and SoCs powering product lines ranging from the rapidly growing field of artificial intelligence to self-driving cars and robots.

**What you'll be doing:**

* Design, develop, and maintain a diagnostics software suite to efficiently stress test NVIDIA GPUs and SOCs to identify hardware defects, including defects that cause silent data corruption. These tests will run in large-scale deployments of Datacenter GPUs and Safety SOCs in package/board/rack configurations spanning GPUs, CPUs, and Networking SOCs.
* Address coverage gaps in the NVIDIA diagnostic suite flagged by silicon failures on customer workloads or test suites. Enhance diagnostics to improve repeatability of failures detected and optimize test time.
* Tests for GPUs in automotive functional safety contexts should include low-level routines to exercise instruction sets, memory subsystems, and interrupt mechanisms, in compliance with ISO 26262 and related safety standards.
* Collaborate with architecture, RTL, and verification teams to ensure safety coverage, correctness, and robustness across GPU generations.
* Study silent data corruption, intermittent faults, and hard-to-reproduce failures in the field, including customer returns (RMAs), to establish root causes and improve detection by diagnostics.
* Support deployment of diagnostics in pre-production qualification environments as well as large-scale production use.

**What we need to see:**

* Master’s or PhD degree in Computer Science, Computer Engineering, Electrical Engineering or closely related degree or equivalent experience.
* 15+ years of relevant experience.
* Ability to reason across hardware/software boundaries to debug complex system-level issues.
* In-depth understanding of the architecture and micro-architecture of high-performance computing systems. Strong knowledge of hardware failure mechanisms that can result in incorrect computation.
* Proficiency in C/C++, CUDA programming.
* Scripting and automation with Python or similar.
* Understanding of the software development life cycle, from requirements to testing closure and maintenance, including creating customer releases and documentation.
* Excellent interpersonal skills and ability to collaborate with on-site and remote teams.
* Strong debugging and analytical skills.
* Be self-driven and results oriented.

**Ways to stand out from the crowd:**

* Familiarity with GPU and SOC Architectures, Machine Learning/Deep Learning concepts.
* Understanding factors causing silent data corruption in hardware.
* Ability to use high performance libraries and write hand-crafted kernels where necessary to create stress conditions to induce hardware failures.
* Experience in embedded software development.

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI - the next era of computing - with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, we are increasingly known as “the AI computing company”.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 320,000 USD - 488,750 USD. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until February 27, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 5 days ago
Similar jobs that could be interesting for you
Based on the Distinguished Resiliency and Safety Architect, GPU Diagnostics in Santa Clara, CA vacancy
  • $320k

    NVIDIA Gruppe is seeking a Distinguished Resiliency and Safety Architect specialized in GPU Diagnostics. The role involves designing diagnostics software to identify hardware defects in NVIDIA GPUs. The successful candidate will collaborate closely with architecture and... 
    Suggested

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  •  ...is seeking a Senior Hardware Engineer to develop solutions for GPU products. You will collaborate in launching new GPU Accelerated...  ...for AI and analytics. Your responsibilities include developing diagnostic tests, defining manufacturing screens, and ensuring high-quality... 
    Suggested

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  • $152k - $287.5k

    A leading technology company is seeking a Senior GPU Memory System Architect in Santa Clara to develop architecture and micro-architecture for GPU memory systems. Candidates should have 3+ years in GPU or CPU architecture and a master's degree in a relevant discipline.... 
    Suggested

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • NVIDIA Gruppe in Santa Clara is seeking a technical leader for the GPU AI/HPC Infrastructure team. You will design and implement cutting-edge GPU compute clusters, focusing on deep learning and high-performance computing. The ideal candidate will have at least 5+ years... 
    Suggested

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  •  ...Architecture Energy Modeling Engineer to develop methodologies for energy-efficient products. The role involves working with teams to improve GPU power consumption through machine learning models and prototyping architectural features. Candidates should have an MS or PhD with 6+... 
    Suggested

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  • $100k - $166.75k

    We are looking for a Datacenter GPU Power Architect - New College Grad! NVIDIA is known as a world leader in providing energy-efficient high-performance products and we continue to invest in the research and development of hyper-efficient GPU and SOC architectures. We... 

    NVIDIA

    Santa Clara, CA
    5 days ago
  • NVIDIA Gruppe is looking for a motivated architect to join the GPU memory architecture team in Santa Clara, California. This role involves developing innovative products to optimize memory systems for various applications including data centers and autonomous vehicles.... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  •  ...Corporation in Santa Clara is looking for a Senior Thermal Solutions Design Engineer to develop innovative thermal solutions for next-gen GPU and SoC products. You will collaborate with various teams and aim for high delivery standards. The ideal candidate has a Bachelor/... 

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  • Compute Kernel Performance Architect NVIDIA is seeking a Compute Kernel Performance Architect with a unique blend of skills: someone who can...  ...draw — and who understands how those kernels interact with the GPU's Power Delivery Network (PDN) at a system level. This is not a... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  • $184k - $287.5k

    Overview We are now looking for a Senior GPU Architect! The NVIDIA GPU Architecture group is looking for world class architects and software developers to join and lead our various architecture efforts. A key part of NVIDIA's strength is to innovate in the graphics and... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  • NVIDIA Gruppe in Santa Clara, California, is seeking a professional to contribute to power estimation models for GPU products. You will analyze performance vs power and deploy machine learning techniques to develop advanced models. The ideal candidate has an MSEE/MSCE,... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  • $184k - $287.5k

    Senior GPU Verification Architect. Locations: US, CA, Santa Clara. Time type: Full time. Posted 6 days ago. Job requisition ID: JR1989512. This is an outstanding opportunity to join a world-class team and play a... 
    Full time
    Work experience placement

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • NVIDIA Corporation is searching for a Senior GPU Architect in Santa Clara, CA to innovate and contribute to the design of our proprietary profiler subsystem. This role focuses on utilizing hardware modeling and verification to enhance GPU performance insights. Prospective... 
    Remote job

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  • NVIDIA Gruppe is looking for a technical leader to design and deliver a low-overhead GPU profiling service that operates continuously in production. This role will involve leading architecture design, implementing system software, and mentoring a team of engineers. The... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  • NVIDIA Gruppe in Santa Clara is looking for a Senior HPC Architect to support the deployment of large-scale GPU compute clusters. You will provide engineering solutions for GPU computing products, ensuring technical relationships with teams and assisting in creative solutions... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  • Overview We are now looking for a Senior GPU & Deep Learning Architect to join the NVIDIA GPU Architecture group. As a senior architect, you will lead architecture efforts for deep learning workloads, design new hardware features, advance parallel computation, and develop... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  • $116k - $189.75k

    What you’ll be doing: You will be contributing to power estimation models and tools for GPU products and systems like NVIDIA DGX/HGX based datacenters. Early GPU & System Architecture exploration with focus on energy efficiency and TCO improvements at GPU and Datacenter... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  • NVIDIA Gruppe in Santa Clara is looking for a Compute Kernel Performance Architect to influence GPU power architecture for future products. You will design CUDA kernels focused on optimizing GPU power consumption and collaborate with teams to validate power integrity.... 

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • NVIDIA in Santa Clara is seeking a Datacenter GPU Power Architect - New College Grad to join their Applied Power Architecture team. You will contribute to power estimation models for GPUs and explore architecture focusing on energy efficiency. The ideal candidate would... 

    NVIDIA

    Santa Clara, CA
    3 days ago
  • Senior Applied Power Architect - GPU NVIDIA is known as a world leader in providing energy‑efficient high‑performance products, and we continue to invest in the research and development of hyper‑efficient GPU and SOC architectures. We are continually innovating in creative... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  • NVIDIA Gruppe in Santa Clara is looking for a Senior Systems Software Engineer to focus on GPU performance at scale. You will be instrumental in driving innovation in AI and GPU computing, contributing to state-of-the-art computing hardware. The ideal candidate has extensive... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  • $320k

    NVIDIA Gruppe is seeking a Distinguished Engineer for the Apache Spark Acceleration group. This role focuses on leading the architecture and implementation of accelerated Apache Spark, while engaging with open source communities. The ideal candidate will have at least 1... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  • $175k - $250k

     ...semiconductor startup in Sunnyvale is looking for a highly experienced GPU Compiler Lead to design and implement a high-performance...  ..., optimizing workloads, and collaborating with hardware architects. The role offers a competitive salary range of $175,000-$250,00... 

    Bolt Graphics

    Sunnyvale, CA
    2 days ago
  • $224k - $356.5k

    Lead Safety Architect - Autonomous Vehicles. Locations: US, CA, Santa Clara: US, DC...  ...AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars... 
    Remote work

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  • NVIDIA Gruppe is looking for a Senior GPU & Deep Learning Architect to join its GPU Architecture group in California. In this role, you will lead efforts to design hardware for deep learning and advance parallel computation across projects. The ideal candidate will hold... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
  •  ...are looking to stretch and grow your career our culture will embrace you. Open up opportunities with HPE. Role Distinguished Technologist, ASIC Design Architect — We are seeking an experienced ASIC Design Architect to own the hardware architecture, microarchitecture, and... 
    Work experience placement
    Work at office

    Hewlett Packard Enterprise

    Sunnyvale, CA
    5 days ago
  • $183k - $365k

    Distinguished Technologist, ASIC Design Architect We are seeking an experienced ASIC Design Architect to own the hardware architecture, microarchitecture, and successful delivery of complex ASICs from concept through tape-out and silicon bring-up. The ideal candidate combines... 

    Hobbsnews

    Sunnyvale, CA
    4 days ago
  • Distinguished Technologist, ASIC Design Architect. We are seeking an experienced ASIC Design Architect to own the hardware architecture, microarchitecture...  ...to help them succeed. COVID Policy: The health and safety of our team members, customers and partners is paramount... 
    Local area

    Hewlett Packard Enterprise Development LP

    Sunnyvale, CA
    2 days ago
  •  ...Together, we advance your career. THE TEAM AMD's Data Center GPU organization is transforming the AI and HPC landscape. Our...  ...ROLE AMD is seeking a highly accomplished Principal Modeling Architect to join the Product Architecture and Workload Strategy team for... 
    Remote work

    Advanced Micro Devices , Inc.

    San Jose, CA
    2 days ago
  • NVIDIA Gruppe is seeking a Senior GPU Architect in Santa Clara, California, to design new hardware for graphics and parallel processing. The ideal candidate will have a strong background in computer architecture and programming skills in C, C++, Perl, and Python. With over... 

    NVIDIA Gruppe

    Santa Clara, CA
    5 days ago
