Senior Reliability Engineer

$116k - $184k

NVIDIA

NVIDIA is the world leader in accelerated computing, developing breakthroughs that tackle challenges no one else can solve. Our work in AI and digital twins is transforming the world's largest industries and profoundly impacting society. Come join the team and help build the next era of computing!

We're seeking an outstanding Senior HTOL Reliability Engineer to join our Santa Clara lab. This role requires deep device-circuitry knowledge and hands-on hardware development. You will build next-generation HTOL boards and run HTOL processes on advanced ovens. This ensures world-class reliability of the silicon powering the AI era.

What you'll be doing:

  • Implement and optimize HTOL test programs aligned with JEDEC standards.

  • Operate and maintain HTOL ovens, ensuring efficient test conditions and high data accuracy.

  • Build and debug burn-in boards, resolving signal-integrity issues and optimizing thermal performance.

  • Apply sophisticated thermal management techniques to deliver precise temperature control and mitigate thermal stress in HTOL environments.

  • Work alongside lab technicians, build engineers, and reliability engineers to solve technical challenges and continuously improve test processes.

  • Contribute to multi-functional teams to debug and resolve hardware and software product issues.

  • Maintain and improve our reliability database, finding opportunities for improvement.

  • Collaborate with vendors to develop and implement improvements to burn-in boards, HTOL systems, and thermal interface materials.

What we need to see:

  • Master's or Bachelor's degree in Electrical Engineering or a related field (or equivalent experience).

  • 5+ years of experience in HTOL test system operation and data analysis for semiconductor devices.

  • Proven expertise in HTOL stress testing, JEDEC standards, and environmental stress tests including Temperature Cycling (TC), Reflow, Thermal Shock, and HAST.

  • Hands-on experience with MCC HTOL chamber operation, repairs, and preventative maintenance.

  • Proficiency with oscilloscopes, current probes, and other test equipment for data acquisition and analysis.

  • Skill in vector debugging, test-script development/modification, and data-analysis tools. ATE experience is a plus.

  • Programming experience with Python or MATLAB for data analysis and automation.

  • Excellent communication, teamwork, and problem-solving skills, with strong attention to detail.

Ways to stand out from the crowd:

  • Experience with dual-die or multi-die configurations and the associated thermal challenges.

  • Background crafting burn-in boards for high-power GPU or SoC devices.

  • Familiarity with reliability analytics platforms (JMP) and statistical lifetime modeling (e.g., Weibull, Arrhenius).

  • Track record driving vendor qualification and component selection for reliability test hardware.

  • Exposure to AI/ML-based approaches for reliability data analysis or predictive failure modeling.

With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. If you're creative and autonomous, with a genuine passion for technology, we want to hear from you.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 116,000 USD - 184,000 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 14, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 7 hours ago
