Senior Staff Site Reliability Engineer

$126k - $204.5k

Palo Alto Networks

Job Summary

The Cortex team builds and delivers the industry’s most advanced SecOps platform, consisting of XDR, XSIAM, XSOAR, and XPANSE. As a member of the Cortex DevOps team, you will operate and maintain a large-scale GCP environment, including the design, implementation, and continuous enhancement of our comprehensive observability systems. To succeed in this role, you will need deep knowledge of modern observability and monitoring tools and practices, having managed high-cardinality metrics, implemented tracing, and operationalized large-scale logging solutions. You will collaborate closely with our engineering teams to develop innovative solutions that provide clear and actionable insights into our systems’ performance and health.

Key Responsibilities

  • Utilize expertise in monitoring cloud platforms, particularly GCP, to optimize our infrastructure, leveraging cloud-native technologies.
  • Improve monitoring processes, alerts, and metrics, and work with development teams to ensure that all of our services have the right monitoring and metrics in place to detect problems before our customers do.
  • Leverage incident management processes to ensure efficient resolution of system issues and minimal impact on services.
  • Automate complex monitoring and alerting tasks by building tools for cloud operations, such as automated remediation of known issues and auto-scaling.
  • Stay up to date with cutting-edge technologies, evaluate their potential impact on our operations, and implement them when appropriate.
  • Provide follow-the-sun operational coverage for our production observability infrastructure.
  • Work with our Engineering team to influence the operability of the product and ensure the reliability and availability of our services.
Qualifications

Required Qualifications

  • 5+ years of experience as a DevOps/SRE engineer with a passion for technology and a strong motivation for high reliability at the service level.
  • High proficiency with Thanos, Prometheus, Grafana, OpenTelemetry, and other monitoring tools.
  • Clear understanding of incident and alert management using tools like PagerDuty and Prometheus Alertmanager.
  • High proficiency in either Google Cloud Platform or Amazon Web Services.
  • High proficiency with Kubernetes and Docker for container orchestration.
  • High proficiency in Python programming and Linux shell commands.
  • Experience with Ansible and Terraform for infrastructure as code.

Preferred Qualifications

  • Effective communication and interpersonal skills, with the ability to work and coordinate between multiple teams in different time zones.
  • Ability to effectively troubleshoot and address emerging and complex problems.
  • Ability to operate independently, make decisions, take action, and take responsibility.

Compensation Disclosure

The compensation offered for this position will depend on qualifications, experience, and work location. For candidates who receive an offer at the posted level, the starting base salary (for non-sales roles) or base salary + commission target (for sales/commissioned roles) is expected to be the annual range listed below. The offered compensation may also include restricted stock units and a bonus. A description of our employee benefits may be found here.

$126,000.00 - $204,500.00/yr

Equal Opportunity Employer

Palo Alto Networks is an equal opportunity employer.
We celebrate diversity in our workplace, and all qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or other legally protected characteristics. All information will be kept confidential according to EEO guidelines.

Vacancy posted 5 days ago
Location: Santa Clara, CA