
AI & Systems Intern: Datacenter Debug & Reliability

$20 - $71 per hour

NVIDIA

NVIDIA is hiring an intern in Santa Clara, California for AI and Systems Software focused on datacenter applications. Candidates will participate in debugging, analyzing system failures, and improving infrastructure reliability. Applicants should be pursuing a relevant degree, have skills in Python and Bash, and experience in HPC environments. The internship offers an hourly rate between $20 and $71, along with benefits. This internship offers a great opportunity to work with cutting-edge technology and a collaborative team.

Vacancy posted 3 days ago
Similar jobs that could be interesting for you
Based on the AI & Systems Intern: Datacenter Debug & Reliability in Santa Clara, CA vacancy
  • $20 - $71 per hour

    NVIDIA AI in Santa Clara is looking for an intern focused on AI and Systems Software for datacenter applications. The role involves system-level debugging and analyzing infrastructure reliability while developing workflows for deep learning solutions. The ideal candidate... 
    Internship
    Hourly pay

    NVIDIA AI

    Santa Clara, CA
    23 hours ago
  • $20 - $71 per hour

    NVIDIA is looking for an intern for an exciting role in AI and Systems Software for datacenter applications. You will be deeply involved in system-level debugging, analyzing large-scale infrastructure reliability, and correlating complex failure modes to underlying... 
    Internship
    Hourly pay

    NVIDIA

    Santa Clara, CA
    3 days ago
  • $20 - $71 per hour

     ...Corporation in Santa Clara is looking for an intern to assist in investigating failures within...  ...compute clusters and analyzing logs to identify system-level issues. This role involves collaboration with mentors to learn debugging methodologies and drive infrastructure... 
    Internship
    Hourly pay

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  •  ...generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of...  ...ROLE: This role serves as the debug execution backbone of AMD's AI...  ...fleet anomalies, and data center reliability issues. Aggregate fleet, RMA,... 
    Suggested

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    4 days ago
  • $136k - $218.5k

     ...is seeking a Silicon Speed Features Engineer to co-design system-level speed features across Gaming, Datacenter, Automotive, and Embedded markets. The role involves collaborating cross-functionally and using AI to enhance automation tools for performance validation. Ideal... 
    Suggested

    NVIDIA

    Santa Clara, CA
    4 days ago
  • $136k - $218.5k

     ...definition across Gaming, Datacenter, Automotive, and...  ...Engineer, you will co‑design system‑level speed features,...  ...characterize them, and lead debug of the complex silicon...  ...tooling—including AI—without losing rigor. What...  .../software, process/reliability, and operations teams to... 

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  •  ...Cerebras Systems builds the world’s largest AI chip, 56 times larger than GPUs. Our novel...  ...tools. Support reliable operation and scale-out of...  ...engineers during complex debugging. • Progress toward independent...  ...systems engineering, or datacenter environments; basic familiarity... 
    Internship
    Work at office

    Dormont Manufacturing Company

    Sunnyvale, CA
    4 days ago
  • $80k - $85k

     ...Software Test Engineers (aka System Test engineers) to help...  ...and leverages ML/AI to simplify operations,...  ...and if troubleshooting & debugging network and system...  ...consistent functionality and reliability. Your job is...  ...: The intern base pay for this role... 
    Internship
    Night shift
    3 days per week

    Arista Networks, Inc.

    Santa Clara, CA
    4 days ago
  • $2,000 per month

     ...building the world’s first AI inference system purpose-built for transformers...  ...the accelerator card and debug link issues using BERTs,...  ...latency, serviceability, and reliability Strong fundamentals in optical...  ...Deep understanding of datacenter infrastructure, specifically... 
    Work at office
    Relocation package

    Jobr

    San Jose, CA
    4 days ago
  • $174k - $252k

    Senior Software Engineer, Embedded Systems/Firmware, AI and Infrastructure. Sunnyvale, CA, USA. Bachelor...  ...at unparalleled scale, efficiency, reliability and velocity. Our customers include...  ...Triage product or system issues and debug/track/resolve by analyzing the sources... 
    Full time
    Worldwide

    Google Inc.

    Sunnyvale, CA
    4 days ago
  • $168k - $264.5k

    Senior System Reliability Engineer. Locations: US, CA, Santa Clara. Time type: Full time. Posted...  ..., we are increasingly known as “the AI computing company.” We're looking to grow...  ...reliability or hardware engineering from datacenter, systems, or computer industries... 

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  •  ...Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs...  ..., scalability, reliability, and usability of next...  ...cloud and in our datacenter. You will work closely...  ...end-to-end triage, debug, and...  ...Read our blog: Intern at Cerebras Apply... 
    Internship

    Cerebras

    Sunnyvale, CA
    1 day ago
  • $200k - $322k

    Join NVIDIA's datacenter product engineering team in our Operations organization...  ...advancement! As a Senior System Debug Engineer, you will drive...  ...industry vendors, suppliers, internal and external engineers,...  ...existing vacancy. NVIDIA uses AI tools in its recruiting processes... 
    Work experience placement
    Overseas

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $20 - $71 per hour

     ...looking for a data analytics intern to work on data‑center telemetry...  ...passionate about large‑scale datacenter development and deployment....  ...and concepts. Proven debugging and problem‑solving skills. Knowledge...  ...of Linux‑based operating systems, Python, Bash, and C/C++. Ways... 
    Internship
    Hourly pay
    Work at office

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • $6,710 per month

     ...environment. As a Research Intern in the Strategic Planning...  ...scale Artificial Intelligence (AI) datacenter environments. Your work will...  ...tracing and analysis systems capable of capturing packet-...  ...collaborate with engineers to improve reliability and strengthen the... 
    Internship
    Ongoing contract
    Summer work
    Local area

    Microsoft Corporation

    Mountain View, CA
    13 days ago
  •  ...XPENG & Volkswagen Group is seeking an entry-level engineer or intern in Santa Clara, CA, to support model optimization and...  ...Responsibilities include assisting in model quantization and deployment, debugging systems, and collaborating with teams to enhance model performance.... 
    Internship

    XPENG & Volkswagen Group

    Santa Clara, CA
    3 days ago
  • $20 - $71 per hour

    NVIDIA Gruppe is looking for an intern to investigate and triage failures within large-scale compute clusters. The role...  ...proficiency in Python and shell scripting, alongside strong debugging skills in complex systems. Interns will work closely with mentors, receive... 
    Internship
    Hourly pay

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  •  ...California, is seeking a Vector Compute Architect Intern to join our advanced architecture team....  ...vector compute architectures for AI and high-performance computing platforms....  ...driving architecture trade-offs, developing system specifications, and collaborating with various... 
    Internship

    Jobleads-US

    Sunnyvale, CA
    23 hours ago
  • AI Chopping Block, Inc. is seeking an intern for its Systems and Safety team in Santa Clara, California. This role involves developing agentic tooling for Systems Engineering applications, including AI chatbot-like interfaces and coding prototypes to enhance internal processes... 
    Internship
    Hourly pay

    AI Chopping Block, Inc.

    Santa Clara, CA
    1 day ago
  •  ...accelerate next‑generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. The NTSG team develops advanced system solutions...  ...correctness and system‑level performance Debug and resolve issues across simulation, emulation, lab... 

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    3 days ago
  •  ...NVIDIA's DGX Cloud AI Efficiency Team...  ...implementing software and systems engineering...  ...meaningful and actionable reliability metrics to track...  ...). Strong debugging skills and experience...  ...analysis of failures and datacenter scale. Good...  ...of DL frameworks internal PyTorch, TensorFlow... 

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  •  ...accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and...  ...workload scheduling across heterogeneous hardware. • Debug and resolve complex system-level performance issues... 

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    3 days ago
  • $192k - $278k

    Technical Program Manager, NPI, AI/ML (GPU) Systems. Google, Sunnyvale, CA, USA. Bachelor's degree in a technical field...  ...delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity. Our customers include Googlers, Google Cloud... 
    Full time
    Worldwide

    Google Inc.

    Sunnyvale, CA
    3 days ago
  • $184k - $287.5k

     ...Software Engineer, DGX Cloud AI...  ...building the software and systems that power the world’s...  ...workloads run efficiently and reliably at scale. You will lead...  ...bring-up, validation, and debugging of large-scale AI...  ...attribution systems for datacenter-scale infrastructure. NVIDIA... 
    Remote work

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  • NVIDIA Corporation in California is seeking a Systems Performance Engineer for agentic AI workloads. In this role, you will develop simulations using C++ and Python to analyze performance for LLM workloads and guide architectural decisions. The ideal candidate has a strong... 

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  • $132k - $207k

    Senior System Power Validation and Applications Engineer...  ...Engineer in the Datacenter System Engineering Team...  ...scalability, manufacturability, reliability, security, protection,...  ...joint development, and debug power system issues of...  ...can solve. Our work in AI and digital twins is... 
    Full time

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • $188.3k - $269.28k

    A leading precision timing company is seeking a Networking System Architect to focus on datacenter, AI, and 5G applications. In this senior role, you will foster technical relationships with customers, lead architectural discussions, and influence strategies in cutting-... 

    SiTime

    Santa Clara, CA
    4 days ago
  • $200k - $322k

     ...Engineering Manufacturing - AI and...  ...role expands include Datacenter Board, Networking and Physical AI Board and System products in various Production...  ...Lead hands‐on factory debugging and issue resolution....  ...trends, defect paretos, reliability data, and test coverage... 
    Contract work

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • NVIDIA Corporation is seeking a Senior Systems Architect in Santa Clara, California. As part of a dynamic team, you'll be crucial in crafting the designs for next-generation AI Super Computing Datacenters. Responsibilities include defining engineering requirements, collaborating... 

    NVIDIA Corporation

    Santa Clara, CA
    3 days ago
  •  ...Intelligence, where smart tech and AI are seamlessly woven into...  ...smartwatches to renewable energy systems that efficiently distribute...  ...Internship Experience: At Synopsys, interns dive into real-world projects,...  ...activities including coding, debugging, testing, and documentation... 
    Internship
    Full time
    Part time
    Summer work
    Summer internship
    Start working today
    Work at office
    Local area
    Worldwide

    Synopsys, Inc.

    Sunnyvale, CA
    4 days ago
