Senior Manager: AI-Driven Site Reliability

$200k - $322k

NVIDIA

A leading technology company is looking for a Senior Manager of Site Reliability Engineering in California. The role involves managing the full lifecycle of IT operations, transforming incident response through AI, and leading a high-performing team. The ideal candidate has extensive experience in Site Reliability Engineering, strong leadership skills, and a deep understanding of modern reliability practices. This full-time position offers a salary range of $200,000 to $322,000 based on experience and location.

Vacancy posted 5 days ago
Similar jobs that could be interesting for you
Based on the Senior Manager: AI-Driven Site Reliability in Santa Clara, CA vacancy
  • Palo Alto Networks, Inc. is seeking a Senior Site Reliability Engineer in Santa Clara, California. You will design and operate cloud infrastructure...  ...across GCP, AWS, and global data centers while leveraging AI and machine learning for transformative operational... 
    Senior
    Website

    Palo Alto Networks, Inc.

    Santa Clara, CA
    4 days ago
  • JPMorgan Chase & Co. is seeking a Director of Site Reliability Engineering to partner with the Infrastructure Platforms and Foundational...  ...through complex projects, leading the application of AI capabilities, and managing teams effectively. Applicants must have over 10 years... 
    Senior
    Website

    JPMorgan Chase & Co.

    Palo Alto, CA
    4 days ago
  • $174k - $239.5k

     ...connect our world – like AI and IoT. If you want to...  ...Technical Marketing Manager recognized as a...  ...storytelling and data-driven decisions. ~ Experience...  ...enablement. Familiarity with reliability, yield, and metrology/...  ...to make our careers site accessible to all... 
    Senior
    Website
    Full time
    Relocation

    Applied Materials

    Santa Clara, CA
    5 hours ago
  • $200k - $322k

    Senior Manager, Site Reliability Engineering...  ...gaming, and accelerated computing, driven by a legacy of continuous innovation...  ...leveraging the immense potential of AI to usher in the next era of computing... 
    Senior
    Website

    NVIDIA Corporation

    Santa Clara, CA
    5 days ago
  • $165k - $230k

     ...for a hybrid role that combines project management, customer success, and technical support...  ...customers. The ideal candidate will manage AI deployments, coordinate projects, and ensure...  ...,000 and will involve travel to customer sites as needed... 
    Senior
    Website

    Alumni Ventures

    Campbell, CA
    5 days ago
  • $152k - $241.5k

    Overview NVIDIA is looking for a Senior Site Reliability Engineer (SRE) to join our...  ...while leveraging AI technologies to deliver groundbreaking...  ...‑as‑Code) and configuration management to standardize and automate...  ...‑end observability or data‑driven operations (AIOps/ML‑driven... 
    Senior
    Website

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  • $208k - $327.75k

     ...is looking for an experienced senior manager to lead a team of PCBA...  ...highest quality of our complex AI systems. Your primary responsibility...  ...manage NVIDIA's manufacturing sites. Collaborate closely with...  ...including process, assembly, test, reliability, customer quality as main... 
    Senior
    Website

    Nvidia Corporation

    Santa Clara, CA
    1 day ago
  • $152k - $241.5k

     ...Overview We’re looking for a Senior SRE to join our Compute...  ...harness the power of AI to deliver...  ...Infrastructure‑as‑Code) and config management to standardize and...  ...lifecycle management, fleet reliability/auto‑healing, E2E observability or data‑driven operations (AIOps/ML‑... 
    Senior
    Website

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • $201.6k - $302k

    Job Description The Role: As the Senior Engineering Manager for Hybrid Services & Reliability (HSR) within AV Core Infrastructure (ACI) at GM, you are the architect...  ...Qualifications) Required: Extensive background in Site Reliability Engineering (SRE) and defining SLO/SLI... 
    Senior
    Website
    Local area
    Remote work
    Work from home
    Relocation
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    1 day ago
  • $181.69k - $213.75k

     ...Senior Site Reliability Engineer San Francisco, California; Santa Clara, California; Seattle, WA...  ...representing nearly $185B in assets under management, with tools designed to enhance the...  .../or GraphQL API design principles. AI Fluency: You use AI tools in your own... 
    Senior
    Website
    Full time
    Work at office

    Carta

    Santa Clara, CA
    5 days ago
  • $216.15k - $262k

     ...vertically integrated AI infrastructure...  ...We are hiring the Senior Staff TPM who will...  ...introduction. Not manage a workstream inside...  ...implications. Evolve the Site Operations PM to...  ...matters for fleet reliability at scale. Networking...  ...clear, data-driven, decision-oriented... 
    Senior
    Website
    Temporary work

    Crusoe

    Sunnyvale, CA
    27 days ago
  • $148k - $235.75k

     ...into the unlimited potential of AI to define the next era of...  ...where you will be working as a Senior SRE Engineer. The position will...  ...needed. What you’ll be doing: Manage NVIDIA's on-prem infrastructure. Maintain uptime, reliability and readiness of on-prem engineering... 
    Senior
    Website
    Remote work

    NVIDIA

    Santa Clara, CA
    1 day ago
  • $150k - $175k

     ...Site Reliability Engineer At ASAPP, our mission is simple: deliver the best AI-powered customer experience—faster than anyone else. To...  ...wherever they are. If you're driven by continuous learning, rapid...  ..., containers and container management frameworks ~ Familiarity with... 
    Senior
    Website
    Remote work

    ASAPP

    Mountain View, CA
    1 day ago
  •  ...accelerator cards targeting AI, ML, networking and...  ...an experienced Senior Manager, Test Development Engineering...  ...integration, and data-driven yield optimization....  ...hardware meets performance, reliability, and cost targets....  ...parallel and multi-site testing to reduce test... 
    Senior
    Website

    Achronix Semiconductor Corporation

    Santa Clara, CA
    2 days ago
  • $126k - $204.5k

     ...Execution, Integrity, and Inclusion. We weave AI into the fabric of everything we do and...  ...monitoring tools and practices, having managed high cardinality metrics, implemented...  ...operability of the product and ensure the reliability and availability of our services. Qualifications... 
    Senior
    Website
    Full time
    Work at office

    Palo Alto Networks

    Santa Clara, CA
    4 days ago
  •  ...security, delivering an AI-powered platform that...  ...leadership role. You will own reliability for major platform...  ..., implement, and manage highly available and scalable...  ...implement shared Event‑Driven Architecture components...  ...Engineering, or Site Reliability Engineering... 
    Senior
    Website

    Saviynt

    Milpitas, CA
    11 days ago
  •  ...tapping into the unlimited potential of AI to define the next era of computing. An era...  ...turns raw, high‑volume telemetry into reliable, job‑centric insights and automation for...  ...performance, data integrity, and safe change management. You’ll own SLOs/SLIs, incident response,... 
    Senior
    Website

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  • $176k - $276k

    Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and...  ...networking, coding, database, capacity management, continuous delivery and deployment and...  ...is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes. NVIDIA... 
    Senior
    Website

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  • Senior Staff Software Engineer, Site Reliability Engineering In accordance with Washington state law, we are highlighting our comprehensive benefits package,...  ...observability. Familiarity with the emerging landscape of AI/ML development and operations (MLOps), particularly... 
    Senior
    Website
    Temporary work

    Google Inc.

    Sunnyvale, CA
    2 days ago
  • $159.2k - $301.6k

     ...Opportunity We are seeking a Senior SRE (Site Reliability Engineer) to help compose,...  ...of cloud-native and AI-enabled platforms. What...  ...APIs, operability, lifecycle management, and developer experience....  ...impact, powered by AI and driven by human ingenuity. Our... 
    Senior
    Website
    Temporary work
    Local area
    Worldwide

    Adobe

    San Jose, CA
    3 days ago
  •  ...Senior Site Reliability Engineer Latitude AI develops automated driving technologies, including L3, for Ford vehicles at scale. We're driven by the opportunity to reimagine what it's like to drive and make travel safer, less stressful, and more enjoyable for everyone... 
    Senior
    Website
    Work at office
    Immediate start

    Latitude AI

    Palo Alto, CA
    5 hours ago
  • Nectar is seeking a Senior Site Reliability Engineer based in Palo Alto to own the reliability and scalability...  .... You will define and implement SLOs, manage incident responses, and collaborate...  ...collaborative team at the forefront of AI marketing infrastructure... 
    Senior
    Website

    Nectar

    Palo Alto, CA
    3 days ago
  • SambaNova Systems is seeking a Senior Cloud Platform Engineer in...  .... This role focuses on the reliability and scalability of our AI inferencing service, requiring experience in Site Reliability Engineering and...  ...infrastructure. The ideal candidate will manage on-call responsibilities,... 
    Senior
    Website

    jobs.frontdoordefense.com - Jobboard

    Palo Alto, CA
    1 day ago
  • $256.05k - $361.48k

    Responsibilities Directs and manages a team of design verification engineers responsible for...  ...Ethernet, USB, AXI/CHI). Experience with AI GPU preferred. Experience with hardware emulation...  ...to split their time between working on-site at their assigned Intel site and off-site.... 
    Senior
    Website
    Local area

    Intel Corporation

    Santa Clara, CA
    1 day ago
  • $88.45k - $184.25k

    A leading AI-driven technology firm is seeking an experienced Incident & Problem Manager to join their global IT operations team. This senior role involves leading the resolution of high-severity incidents and shaping AI-driven automation strategies within incident management... 
    Senior

    Omnissa, LLC

    Mountain View, CA
    1 day ago
  • Cerebras is looking for a Senior Site Reliability Engineer to join their Infrastructure team in Palo Alto, California. This role involves designing and optimizing infrastructure for distributed AI applications, contributing to the open-source Ray project, and ensuring... 
    Senior
    Website

    Cerebras

    Palo Alto, CA
    4 days ago
  • $181k - $197k

    Clutch Canada in Palo Alto is seeking a Senior Site Reliability Engineer to oversee the production reliability and performance of our SaaS platforms...  ...cloud infrastructure and a proactive approach to project management. We offer a competitive annual base salary range of $181-1... 
    Senior
    Website

    Clutch Canada

    Palo Alto, CA
    1 day ago
  •  ...infrastructure for Apple's services. The role involves developing AI-powered tooling, automating deployment, and ensuring that...  ...efficiently. Applicants should have at least 8 years of experience in site reliability engineering, a strong background in cloud infrastructure, and... 
    Senior
    Website

    Apple Inc.

    Cupertino, CA
    2 days ago
  • $168k - $258.75k

     ...Senior Technical Program Manager, DGX Cloud Software Products and Services...  ...Cloud (DGXC) powers AI for strategic research...  ...emphasizing resilience, reliability, and goodput. This...  ...operations, and environments driven by data and research....  ..., platform, site reliability, operational... 
    Senior
    Website

    NVIDIA

    Santa Clara, CA
    2 days ago
  • A leading technology company seeks a Senior Technical Program Manager in Santa Clara, CA, to drive AI initiatives for Chip Design. The role involves leading planning and execution, collaborating with engineering teams, and ensuring scalable delivery of AI capabilities.... 
    Senior

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
