Senior Manager, Site Reliability Engineering
$200k - $322k
NVIDIA
## Senior Manager, Site Reliability Engineering

locations: US, CA, Santa Clara
time type: Full time
posted on: Posted Yesterday
job requisition id: JR2016119

For over 25 years, NVIDIA has been at the forefront of transforming computer graphics, PC gaming, and accelerated computing, driven by a legacy of continuous innovation and exceptional talent. We are now leveraging the immense potential of AI to usher in the next era of computing, where our GPUs power the "brains" of computers, robots, and autonomous vehicles that can comprehend the world. This pioneering work demands vision, innovation, and the world's best talent. Join our diverse and supportive environment, where NVIDIANs are inspired to excel and make a profound global impact.

NVIDIA is seeking a Senior Manager of Site Reliability Engineering to lead and reshape how IT operations function at scale. This role goes beyond traditional service management to build AI-powered systems that enhance reliability, speed, and employee experience. We offer an outstanding opportunity to lead and refine Incident, Problem, and Change Management into an intelligent, automated operating model using observability, AI insights, and orchestration. This leader will combine strong operational execution with an SRE mindset, facilitating the move from reactive processes to predictive and autonomous operations.

**What you’ll be doing:**

* Manage the full lifecycle of Incident, Problem, and Change Management as a 24×7 operational function, ensuring high reliability and minimal business disruption.
* Transform incident response by applying AI detection, correlation, and guided remediation, reducing time to detect, respond, and resolve.
* Build and scale intelligent incident workflows that integrate monitoring, telemetry, and service context to enable faster and more consistent response.
* Evolve Problem Management into a data-driven discipline, using AI and analytics to identify patterns, eliminate recurring issues, and drive systemic fixes.
* Modernize Change Management by introducing risk-aware, data-driven decisioning, improving change success rates, and reducing blast radius.
* Drive the adoption of observability as a foundation, ensuring service-level visibility, signal quality, and actionable insights across the IT ecosystem.
* Lead the development of automation and orchestration platforms that reduce manual effort across the outage lifecycle, including detection, triage, communication, and root cause analysis (RCA).
* Partner closely with engineering, infrastructure, and business teams to align operations with service reliability goals and SLOs.

**What we need to see:**

* BS, MS, or PhD in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, other Engineering or related fields (or equivalent experience).
* 5+ years of experience leading and managing global IT operations or service management teams, with growing scope and complexity.
* 12+ overall years of experience in Site Reliability Engineering, IT Service Management, with a focus on Incident Management, Problem Management, and Configuration Management.
* Proven proficiency in Incident, Problem, and Change Management with a consistent record of delivering measurable gains in reliability and efficiency.
* Demonstrated experience applying AI, automation, or advanced analytics to improve operational outcomes.
* Solid understanding of observability, monitoring ecosystems, and modern reliability practices (SRE principles, SLOs, error budgets).
* Demonstrated ability to move organizations from process-heavy to technology-focused operating models.
* Strong leadership capability with experience building and scaling engineering-focused teams (SRE, SWE, or equivalent).
* Ability to deliver executive-level communication and insights, translating operational signals into clear, actionable narratives for leadership.
* Ability to build and lead a high-performing team of SREs and engineers, encouraging a culture of ownership, innovation, and continuous improvement.

**Ways to stand out from the crowd:**

* ITIL knowledge and/or certification.
* Experience building or scaling AI-powered operational platforms.
* Ability to challenge traditional ITSM models and introduce innovative, scalable approaches.
* A mindset of automation first, prevention over reaction, and systems over process.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 200,000 USD - 322,000 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until April 17, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA Corporation
$168k - $270.25k
Senior Site Reliability Engineer. locations: US, CA, Santa Clara; time type: Full time; posted on: Posted... ...Engineering, Production Engineering, or Incident Management roles* Bachelor’s or Master’s degree in Computer Science...
Senior
$174k - $252k
Senior Software Engineer, Site Reliability Engineering X Applicants in San Francisco: Qualified applications with arrest or conviction records will be considered for employment in accordance with the San Francisco Fair Chance Ordinance for Employers and the California...
Senior, Full time
$148k - $235.75k
...on the world. Join our team of innovative engineers who are building an AI Data Center AIOps... ...turns raw, high-volume telemetry into reliable, job-centric insights and automation for... ...performance, data integrity, and safe change management. You’ll own SLOs/SLIs, incident response...
Senior
$145k - $165k
A technology solutions firm in Sunnyvale, CA is looking for a highly experienced Site Reliability Engineer (SRE). This role involves maintaining uptime and performance across systems. Exceptional Linux expertise and automation skills in Bash and Python are crucial. Key...
Senior
$176k - $276k
Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high... ...systems, networking, coding, database, capacity management, continuous delivery and deployment and open source cloud...
Senior
$181.69k - $213.75k
...funds and SPVs, representing nearly $185B in assets under management, with tools designed to enhance the strategic impact of... ...solve today unlock the opportunities of tomorrow. As a Senior Site Reliability Engineer, you’ll work to: Build and scale our internal platform...
Senior, Full time, Work at office
- ...Sr. Manager API Platform Make Next Happen Now. For more than 30 years, the Bank has helped innovative companies and their investors... ...Management platform. You will work cross-functionally with Architects, Engineers, Business Analysts, and Service Managers across multiple teams...
Senior
$129.3k - $193.9k
Northrop Grumman Corp. (JP) is seeking a Deputy Operations Program Manager in Sunnyvale, CA. This role involves leading project teams, managing manufacturing operations, and ensuring program delivery meets schedule and budget. Ideal candidates bring extensive experience...
Senior
$272k - $431.25k
A leading technology company in Santa Clara is looking for a Senior Manager in Systems Software Engineering to drive the development of cloud services. The ideal candidate will have over 10 years of software development experience, including 5 years in leadership roles...
Senior
- ...A global healthcare leader is seeking a Senior Product Manager to drive marketing strategies for innovative coronary therapies, including Intravascular Lithotripsy (IVL). This fully remote role focuses on increasing product penetration and launching new campaigns while...
Senior, Remote work
- Robotics Process Automation, LLC is looking for an experienced iOS Engineer based in Sunnyvale, California. The ideal candidate will have over 8 years of experience in iOS development and a passion for delivering high-quality mobile applications. Responsibilities include...
Senior
$224k - $356.5k
...how you can make a lasting impact on the world. As a Senior Developer Relations Manager for Data Platforms, you’ll work with our most strategic... ...offerings and alignment at the business, product, and engineering levels. Build relationships with executive and technical...
Senior
$272k - $431.25k
...developers in the semiconductor ecosystem. The Industrial Engineering organization is a strong, growing, and visible group both inside... ..., SK Hynix, TSMC, Qualcomm, Intel, etc. with 5+ years people management experience. ~ MS/PhD in Electrical Engineering, Computer...
Senior
- ...A leading construction recruitment firm is looking for a Senior Project Manager for doors, frames, and hardware projects. The role entails leading multiple commercial projects, ensuring client satisfaction, and maintaining profitability. Candidates should have over 7...
Senior, Full time, Remote work
- ...A leading global supply chain services provider is seeking a Key Account Executive to manage and grow accounts for top companies. This fully remote position requires strong consultative sales skills, a deep understanding of client needs, and the ability to build relationships...
Senior, Remote work
- A leading tech company is seeking an SAP Test Manager to oversee comprehensive testing activities for an SAP upgrade project. The ideal... ..., and ensuring compliance with industry standards. This senior role offers competitive compensation in a dynamic and collaborative...
Senior
$196k - $310.5k
...accelerated computing technology across various industries Develop key messages that resonate with external audiences Pitch and manage stories with reporters to secure impactful press coverage Nurture strong long-term relations with business, technology and trade...
Senior
$232k - $368k
NVIDIA Corporation is seeking a Senior Manager for the System Integration in the Silicon Co-Design Group based in Santa Clara, CA. The... ...excellent leadership skills, and a strong background in electrical engineering. Competitive salary between $232,000 - $368,000 annually,...
Senior
$232k - $368k
NVIDIA AI is looking for a Senior Manager to lead the Silicon Co-Design Group in Santa Clara, California. This role includes planning post-silicon feature integration, leading technical teams, and building operational rigor. The ideal candidate will have over 12 years...
Senior
$188k - $275k
...What You'll Do: The Observability Engineering organization at CoreWeave is responsible... ...telemetry pipelines, and observability reliability, enabling teams to detect issues quickly... ...the role: CoreWeave is seeking a Senior Manager, Observability Engineering to lead a team...
Senior, Permanent employment, Temporary work, Casual work, Work at office, Remote work, Flexible hours
$210k - $270k
Zocdoc is seeking a Senior Site Reliability Engineer to develop and maintain distributed production systems. The ideal candidate will have over 5 years of experience in site reliability or production engineering, particularly in cloud environments like AWS. Responsibilities...
Senior
$184k - $287.5k
...We are looking for a Senior Developer Relations Manager to drive strategic technical teamwork with leading Agentic AI companies building the next... ...building agents, model fine-tuning, tool calling, and context engineering, combined with a strategic understanding of the rapidly...
Senior, Work experience placement
$200k - $322k
A leading technology company is looking for a Senior Manager of Site Reliability Engineering in California. The role involves managing the full lifecycle of IT operations, transforming incident response through AI, and leading a high-performing team. The ideal candidate...
Senior, Full time
- Lockheed Martin in Sunnyvale, California is looking for a Senior Manager to lead technologies that support national defense, particularly in optics and electro-optics. This role entails overseeing R&D programs, managing budgets, and collaborating with various teams within...
Senior
$232k - $368k
Nvidia Corporation in Santa Clara is seeking a System Integration Lead to manage and resolve critical silicon issues before production. The role involves leading a team focused on delivering high-quality silicon, developing strategies to keep programs on schedule, and...
Senior
$130k - $160k
DeWinter Group is seeking a Senior Financial Reporting & Technical Accounting Lead in Sunnyvale, CA. This role involves architecting... ...expansion, leading GAAP-compliant financial statement preparation, and managing audit relationships. The ideal candidate has 3-6 years of...
Senior
$207k - $300k
Google Inc. is looking for a Staff Software Engineer specializing in Site Reliability Engineering in Sunnyvale, CA. This role combines software and systems engineering to build and manage distributed systems, ensuring high reliability and uptime. The ideal candidate should...
Senior
$160k - $220k
A leading vehicle intelligence company in California is seeking a Solutions Engineering Manager to lead technical pre-sales for global customers. This role demands expertise in automotive software development and customer engagement, with a focus on closing deals. Ideal...
Senior, Flexible hours
$127.84 per hour
...in marketing working across one or more marketing fields (i.e. growth, product marketing, brand marketing, social). ~ Experience managing cross-functional or cross-team projects. Preferred Qualifications: Ability to thrive in a scrappy, fast-paced environment...
Senior, Hourly pay, Contract work
- ...Senior Manager, Deal Desk LeanData helps the world's fastest-growing companies automate, simplify, and accelerate revenue. We're seeking... ...Desk to drive strategic initiatives across the commercial engine, owning complex enterprise deals while designing scalable processes...
Senior, Full time, Work at office, Flexible hours


