Senior AIOps SRE for AI Data Center Platform

NVIDIA Gruppe

NVIDIA Gruppe in Santa Clara is seeking an experienced engineer to build an AI Data Center AIOps platform. The ideal candidate will have a strong background in Kubernetes and automation, ensuring the reliability of GPU fleet management. Key responsibilities include monitoring platform health, owning infrastructure deployments, and leading incident resolution. Candidates should possess 5+ years of experience in production systems and a degree in CS/CE. A competitive salary and generous benefits package are offered.

Vacancy posted 5 days ago
Similar jobs that could be interesting for you
Based on the Senior AIOps SRE for AI Data Center Platform in Santa Clara, CA vacancy
  • $148k - $235.75k

     ...tapping into the unlimited potential of AI to define the next era of computing. An...  ...engineers who are building an AI Data Center AIOps platform that turns raw, high-volume telemetry into...  ...operating production distributed systems as SRE/DevOps/Platform Ops. Proven... 
    Platform
    Senior

    NVIDIA

    Santa Clara, CA
    5 days ago
  • $136k - $212.75k

    NVIDIA Gruppe in Santa Clara is hiring a Senior Power System Engineer to lead the development...  ...high-current power systems for advanced AI accelerators. The role involves architecting power delivery systems for data center platforms and collaborating with cross-functional... 
    Platform
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  •  ...Architect focused on security architecture for Client and Data Center SoCs. You'll drive AI-driven tools for analysis and patching vulnerabilities...  ...The ideal candidate will need extensive experience in SOC/platform security architecture and proficiency with AI tools. A... 
    Platform
    Senior

    Intel Corporation

    Santa Clara, CA
    3 days ago
  • $184k - $356.5k

     ...is seeking an experienced Network Solutions Architect Engineer to help deploy next-generation AI networking platforms. You will guide architecture decisions across data centers, support on-site setups, and provide solutions to key technology customers. The role requires... 
    Platform
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  •  ...Systems builds the world's largest AI chip, 56 times larger than...  ...Sr. TPM role owns site and data center operations programs...  ...metrics, and operational risks to senior leadership Required Background...  ...basics ~ Hardware-centric platforms Proven ability to define... 
    Platform
    Senior

    CEREBRAS SYSTEMS INC.

    Sunnyvale, CA
    1 day ago
  • $152k - $241.5k

     ...NVIDIA seeks a senior software engineer to join the AI Networking co-design and benchmark R&D team. In this...  ...utilization of system resources at data center scale. The role involves working on...  ...with diverse hardware and platforms, such as Host Channel Adapters (HCAs... 
    Platform
    Senior

    NVIDIA

    Santa Clara, CA
    1 day ago
  • $224k - $356.5k

     ...Station (Galaxy) is NVIDIA’s workstation-class AI computer—built on GB300 Blackwell GPUs with NVLink interconnect, delivering data-center-grade AI compute in a deskside form factor...  ...-GPU, high-bandwidth architecture of this platform. We are looking for a deeply technical... 
    Platform
    Senior
    Local area

    NVIDIA

    Santa Clara, CA
    4 days ago
  • $148k - $235.75k

     ...NVIDIA is looking for a Senior AI Compute Engineer to join its Infrastructure...  ...to revolutionize deep learning and data analytics, and to power data centers. Join the team building many of the...  ...GPFS. Familiarity with OEM GPU platforms NVIDIA is widely considered... 
    Platform
    Senior
    Remote work

    NVIDIA

    Santa Clara, CA
    9 days ago
  • A leading technology company is seeking a Senior Technical Marketing Engineer to focus on AI data center networking in Santa Clara. This role involves delivering content on NVIDIA’s networking platforms, articulating challenges with high-density deployments, and developing... 
    Platform
    Senior

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  •  ...-generation computing experiences-from AI and data centers, to PCs, gaming and embedded systems. Grounded...  ...technology. THE PERSON: As a Senior Staff Software Developer, you will be...  ...AI agents, ensuring AMD is the platform of choice for the most demanding workloads... 
    Platform
    Senior

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    4 days ago
  • $184k - $287.5k

     ...is looking for an experienced Network Solutions Architect Engineer to help bring our next-generation AI networking platforms into production at customer data centers. Do you want to be part of a team that brings new AI hardware and software technologies to production... 
    Platform
    Senior
    Remote work

    NVIDIA

    Santa Clara, CA
    23 days ago
  • $189k - $210k

     ...job location. Cohesity is a leader in AI-powered data security and management. Aided by an...  ...get value from data - across the data center, edge, and cloud. Cohesity helps organizations...  ...you have experience building a Gen AI Platform from beginning to end. This is a... 
    Platform
    Senior
    Hourly pay
    Full time
    Work at office
    Shift work
    2 days per week
    3 days per week

    Cohesity

    Santa Clara, CA
    1 day ago
  • $152k - $241.5k

     ...brings Artificial Intelligence (AI) to some of the biggest...  ...Learning (ML), Deep Learning (DL), Data Analytics and other related topics...  ...on various Cloud Computing Platforms. As part of the NVIDIA...  .../containers, Kubernetes, data center deployments, etc. ~ Effective... 
    Platform
    Senior
    Local area
    Remote work

    NVIDIA

    Santa Clara, CA
    4 days ago
  • $237.6k - $318.24k

     ...Senior Staff Software Engineer For AI Model Lifecycle Team Crusoe is on a mission to accelerate the abundance...  ...across energy, manufacturing, data center construction, and cloud services....  ...in building a comprehensive managed platform for the entire application development... 
    Platform
    Senior
    Temporary work

    Crusoe

    Sunnyvale, CA
    3 days ago
  • $208k - $327.75k

     ...NVIDIA is driving a vision for AI factories that convert tokens to intelligence at scale...  .... Collaborate with NCP operators, SRE teams, and hardware vendor partners to integrate...  ...management experience in infrastructure, platform, or MLOps areas, or equivalent background.... 
    Platform
    Senior
    Live in

    NVIDIA

    Santa Clara, CA
    5 days ago
  • $184k - $287.5k

     ...Senior Solution Network Architect, Enterprise Products We are...  ...implementations for enterprise-grade AI systems. Your role...  ...detailed specifications for platforms and datacenter architectures,...  ...understanding of Ethernet, InfiniBand, data center LAN (local area networking),... 
    Platform
    Senior
    Local area
    Remote work

    NVIDIA

    Santa Clara, CA
    5 days ago
  • $190.61k - $361.48k

    Job Overview Join Intel's AI Revolution. Intel's new AI SoC organization...  ..., from edge devices to data‑center accelerators. We are seeking...  ...across teams. Mentor senior engineers and provide technical...  ...defining or influencing SoC or platform architecture roadmap across multiple... 
    Platform
    Senior
    Local area
    Shift work

    Intel Corporation

    Santa Clara, CA
    1 day ago
  •  ...‑generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded...  .... THE ROLE The Enterprise AI Partner Senior Specialist role is an opportunity to...  ...vendors that align with company platform and ecosystem priorities. Ideate, drive... 
    Platform
    Senior

    Advanced Micro Devices

    Santa Clara, CA
    3 days ago
  • $187.2k - $208k

     ...location. Cohesity is a leader in AI-powered data security and management....  ...from data — across the data center, edge, and cloud. Cohesity...  ...selling practices. As the Senior Enablement AI Specialist, you...  ...teams that own the core AI platform, providing application‑level... 
    Platform
    Senior
    Full time
    Work at office
    2 days per week
    3 days per week

    Cohesity

    Santa Clara, CA
    5 days ago
  • Business Area Engineering Seniority Level Mid-Senior level...  ...to transform complex data into clear and...  ...operational databases, and AI. Cloudera is looking for...  ...AI and machine learning platform. You will be responsible...  ...provider or in private data centers. You’ll work with... 
    Platform
    Senior
    Work from home
    Worldwide
    Flexible hours

    Nerdleveltech

    Santa Clara, CA
    1 day ago
  •  ...generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems....  ...future of AI and beyond. The Role As a Senior Manager (Individual Contributor), AI...  ...Data Center GPU and AI infrastructure platforms. This partner‑facing role focuses on driving... 
    Platform
    Senior

    AMD

    Santa Clara, CA
    4 days ago
  • $262k - $365k

    Senior Staff Research Scientist, Google Cloud AI Research Google Sunnyvale, CA, USA Requirements...  ...machine (and deep) learning, data mining, natural language...  ...compilers for mobile platforms, as well as core search...  ...worldwide. We’re at the center of amazing work at... 
    Platform
    Senior
    Full time
    Worldwide

    Google Inc.

    Sunnyvale, CA
    1 day ago
  • $131k - $175k

     ...Senior Hardware Systems Engineer – AI Rack & Cluster Infrastructure Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. What sets...  ..., ensuring Arista platforms integrate cleanly into... 
    Platform
    Senior
    Remote work
    Flexible hours

    Arista Networks, Inc.

    Santa Clara, CA
    4 days ago
  • $174k - $252k

    Senior Software Engineer, Embedded Systems/Firmware, AI and Infrastructure Sunnyvale, CA, USA Bachelor’...  ...years of experience with data structures/algorithms....  ...and maintaining our data centers to building the next generation of Google platforms, we make Google's product... 
    Platform
    Senior
    Full time
    Worldwide

    Google Inc.

    Sunnyvale, CA
    5 days ago
  • $166k - $244k

    Senior Software Engineer, Infrastructure, Google Cloud AI Note: By applying...  ...of experience with data structures/algorithms....  ...providing the essential platforms that enable developers...  ...Global Networking, Data Center operations, systems... 
    Platform
    Senior
    Full time
    Worldwide

    Google Inc.

    Sunnyvale, CA
    1 day ago
  • $174k - $252k

    Senior Software Engineer, Performance, AI and Infrastructure Google Sunnyvale, CA, USA Bachelor...  ...performance, large scale systems data analysis, visualization...  ...and maintaining our data centers to building the next generation of Google platforms, we make Google's product portfolio... 
    Platform
    Senior
    Full time
    Worldwide

    Google Inc.

    Sunnyvale, CA
    4 days ago
  • A global technology leader is looking for an experienced SRE software engineer in Cupertino, California, to build and enhance compute...  ...infrastructure for Apple's services. The role involves developing AI-powered tooling, automating deployment, and ensuring that services... 
    Platform
    Senior

    Apple Inc.

    Cupertino, CA
    4 days ago
  •  ...Senior Solution Architect – AI / GPU Cloud Mountain View, California, United States About the Job...  ...scaling Partner with Infrastructure, Data Center Ops, and Engineering teams...  ...product and engineering to improve the platform Required Qualifications Technical... 
    Platform
    Senior

    Glint Tech Solutions LLC

    Mountain View, CA
    4 days ago
  •  ...located in Santa Clara, is seeking a talented engineer to join their platform SWQA team, focusing on server integration and automation. The...  ...with OS automation, strong Linux skills, and a background in AI tools. You will contribute to the development and execution of NVIDIA... 
    Platform
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  • $165k - $220k

     ...CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and...  ...AI revolution—working across data centers, hardware systems, and...  .... About the role: As a Senior Specialist Field Engineer CoreWeave... 
    Platform
    Senior
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Flexible hours

    CoreWeave

    Sunnyvale, CA
    17 days ago
