
AI/ML Systems Quality & Reliability Engineer

NVIDIA Corporation

NVIDIA Corporation is seeking a Systems Quality and Reliability Engineer to join their LPU team. This role is crucial for ensuring the reliability of NVIDIA's AI/ML products through in-depth root-cause analysis and failure investigations. The ideal candidate will have a BS/MS in Electrical Engineering or a related field, along with 5+ years of experience in systems quality engineering. The position includes managing RMA and FA procedures and collaborating with various engineering teams.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the AI/ML Systems Quality & Reliability Engineer vacancy in Santa Clara, CA
  • $110.5k - $152k

    Quality & Reliability Systems Engineer (E3). Locations: Santa Clara, CA. Time type: Full time. Posted on: Posted Today. Job requisition ID: R2620088. Who...  ...technologies that literally connect our world - like AI and IoT. If you want to push the boundaries of materials... 
    Suggested
    Full time
    Relocation

    Applied Materials, Inc.

    Santa Clara, CA
    1 day ago
  • Job Summary We are seeking a Systems Quality and Reliability Engineer to join our LPU team at NVIDIA. The role focuses on ensuring the reliability of NVIDIA AI/ML products through comprehensive root‑cause analysis and failure investigation. Responsibilities Own, build,... 
    Suggested
    Contract work

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  • $136k - $218.5k

    NVIDIA in Santa Clara is seeking a Silicon Speed Features Engineer to co-design system-level speed features across Gaming, Datacenter, Automotive,...  ...The role involves collaborating cross-functionally and using AI to enhance automation tools for performance validation.... 
    Suggested

    NVIDIA

    Santa Clara, CA
    4 days ago
  • $124k - $195.5k

     ...working at the cutting edge of AI infrastructure. As agentic LLM workloads...  ...on modern datacenters, we need engineers who can model, simulate, and reason about complex system-level traffic at scale. If you...  ...characterizing or benchmarking ML inference workloads NVIDIA is... 
    Suggested

    NVIDIA

    Santa Clara, CA
    3 days ago
  • $136k - $218.5k

     ...Power Architecture & Optimization Engineer to push the limits of energy...  ...efficiency using advanced analytics and AI, including LLMs trained...  ...models and flows, including ML/RL‑based techniques for anomaly...  ...applied to EDA, architecture, or system‑level optimization. Interest... 
    Suggested

    NVIDIA

    Santa Clara, CA
    3 days ago
  • $224k - $356.5k

     ...the unlimited potential of AI to define the next era of...  .../ Principal Deep Learning Engineer — Model Evaluation & AI Systems, you will play a meaningful...  ..., benchmarks, or ML infrastructure used by other...  ...appreciation for evaluation quality, including correctness, reproducibility... 

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $110.5k - $152k

    Applied Materials, Inc. in Santa Clara, CA is seeking a Quality & Reliability Systems Engineer (E3) to ensure product quality and reliability through testing and evaluation. This full-time position involves developing quality standards, implementing testing methods, and... 
    Full time

    Applied Materials, Inc.

    Santa Clara, CA
    1 day ago
  •  ...leading a team focused on distributed AI communication systems and setting technical direction. Candidates...  ...have at least 8 years of software engineering experience and 3 years of people...  ...programming in C/C++, and familiarity with ML systems concepts is essential. The position... 

    NVIDIA Corporation

    Santa Clara, CA
    23 hours ago
  • $255.85k - $361.2k

    Job Overview We are seeking a Principal Engineer to define and architect the next generation of distributed AI systems across heterogeneous compute platforms, including CPUs...  ...degree, or 6+ years with a PhD. Experience with AI/ML systems, inference infrastructure, or large‑... 
    Local area
    Shift work

    Intel Corporation

    Santa Clara, CA
    2 days ago
  • $180k - $200k

    Uber is hiring a Senior Staff Engineer to architect and scale an autonomous support agent, enhancing customer experience using GenAI...  ...will have over 10 years of experience in building production ML/AI systems and will lead voice agent initiatives. This role offers a... 

    Uber

    Sunnyvale, CA
    4 days ago
  • $131k - $175k

     ...Senior Hardware Systems Engineer – AI Rack & Cluster Infrastructure Arista Networks is an industry...  ...to maintain the highest standards of quality and performance in everything we do....  ...directly with hyperscalers or large-scale AI/ML cluster deployments Experience... 
    Remote work
    Flexible hours

    Arista Networks, Inc.

    Santa Clara, CA
    3 days ago
  • $184k - $287.5k

     ...We are seeking highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency...  ...research that pushes the pareto frontier for the field of ML Systems; survey recent publications and find a way to... 

    NVIDIA

    Santa Clara, CA
    2 days ago
  • A leading AI technology company located in Sunnyvale, California, is looking for an experienced engineer to join its SOTA Training Platform team. The ideal candidate...  ...include bringing ML models to life on Cerebras CSX systems, performance tuning, and contributing... 

    Cerebras

    Sunnyvale, CA
    23 hours ago
  • $132k - $189k

    Google is seeking a talented Hardware Engineer to drive the future of AI/ML hardware acceleration. You will develop custom silicon solutions and lead the design and validation of hardware systems for test chips, working collaboratively with cross-functional teams. The role... 

    Google

    Sunnyvale, CA
    4 days ago
  • NVIDIA Gruppe is seeking a Senior Engineer in Santa Clara, CA, to join the Cosmos team. This role focuses on creating AI-native systems that enhance the efficiency of machine learning workflows. Candidates should have extensive Python and PyTorch experience, along with... 

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • Rhoda AI in Palo Alto is looking for a Robot Systems QA Engineer to enhance the quality and reliability of their advanced robotics platform. This role involves designing and executing validation frameworks while collaborating with cross-functional teams to ensure performance... 

    Rhoda AI

    Palo Alto, CA
    2 days ago
  • $165k - $180k

     ...innovative robotics, AI, and 3D ultrasound...  ...3D ultrasound system. To succeed here...  ...role The Imaging Engineer at iSono Health will...  ...performance, high reliability, supply continuity,...  ...Collaborate with AI/ML engineers to develop...  ...expectations and quality standards. Deliberately... 

    iSono Health

    Sunnyvale, CA
    4 days ago
  • $120k - $172k

    Google Inc. is seeking a Product Quality Engineer in Sunnyvale, CA, to lead quality initiatives for AI infrastructure. The role involves collaborating with various teams to embed quality into product development, ensuring consistent quality across all stages. Ideal candidates... 

    Google Inc.

    Sunnyvale, CA
    4 days ago
  • Job Description As a Senior Systems Research Engineer, you will join a future-...  ...explore and build embodied AI applications at the intersection...  ...of state-of-the-art AI/ML and robotics. In this deeply...  ...and integrated into high-quality, reliable product development. You will... 

    Intuitive

    Sunnyvale, CA
    2 days ago
  • $181.1k - $318.4k

    A leading technology firm located in Santa Clara, California is seeking an experienced Machine Learning Data Engineer to design and implement AI robotic systems. This role requires strong software engineering skills and at least 3 years of relevant experience. Candidates... 

    Apple Inc.

    Santa Clara, CA
    4 days ago
  •  ...highly motivated Software Engineer to join our growing AI and Generative AI engineering...  ...of large-scale AI systems powering next-generation applications...  ..., and improve the reliability, safety, and performance of...  ...infrastructure for large-scale ML training, inference, and... 

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • $152k - $208.5k

     ...leader in materials engineering solutions used to...  ...our world – like AI and IoT. If you want...  ...devices to ensure quality and functionality....  ...in intricate systems, deciphering code,...  ...expertise for highly reliable, observable, developer...  ...integrating AI/ML models into... 
    Full time
    Relocation

    Applied Materials

    Santa Clara, CA
    1 day ago
  •  ...generation computing experiences - from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of...  ...looking for a Senior Staff AI Infra Engineer who is passionate about improving...  ...benchmarks, with a special focus on AI/ML workloads and GPU-accelerated... 

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    3 days ago
  •  ...CUDA, KAFKA Experience with MLOps, performance benchmarks for ML models, optimizing and deploying ML models What skills...  ...with Data science and Applied scientists to help with setting up ML system infrastructure aspects. What is the project this person... 
    Contract work
    Work at office
    Remote work
    Shift work
    2 days per week
    1 day per week

    E-Solutions

    Sunnyvale, CA
    23 hours ago
  •  ...CUDA, KAFKA Experience with MLOps, performance benchmarks for ML models, optimizing and deploying ML models Top Resource Skills...  ...Data science and Applied scientists to help with setting up ML system infrastructure aspects. Project Contribution Help setup... 
    Contract work
    Work at office
    2 days per week
    1 day per week

    Samprasoft

    Sunnyvale, CA
    2 days ago
  • $125k - $175k

     ...-on Senior Software & AI Test Engineer to design and operationalize...  ..., automation-first quality framework across our software and AI-driven systems. This role owns test...  ...pipelines, and AI/ML components. The mandate...  ..., scalability, and reliability testing. • Exposure... 
    Shift work

    Covalent

    Sunnyvale, CA
    2 days ago
  • NVIDIA Corporation in Santa Clara is seeking an experienced hardware engineer to collaborate cross-functionally on system-level features. Responsibilities include defining specifications, performing validation, and leading complex debug efforts to ensure timely product... 

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  • $152k - $287.5k

    NVIDIA Gruppe is seeking a highly motivated Software Engineer to contribute to the design and development of large-scale AI systems. The successful candidate will work on scalable infrastructure for ML training and cloud-native platforms, leveraging cutting-edge technologies... 

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • A leading tech company is looking for a Senior Hardware Engineer in California to work on cutting-edge ML/AI hardware systems projects. You will be responsible for leading the development of hardware designs from concept to production, validating new products, and ensuring... 

    Google

    Sunnyvale, CA
    4 days ago
  •  ...Business Area: Engineering Seniority Level: Mid-Senior level...  ...experience - enabling data and AI workloads to run anywhere, without...  .... You will build the "nervous system" of our AI stack - optimizing...  ...least 2+ years focused on AI/ML systems. Expert proficiency... 
    Work from home
    Flexible hours

    Cloudera

    Alviso, CA
    4 days ago
