
Senior Site Reliability Engineer (SRE)

$181k - $197k

Clutch Canada

Senior SRE
Palo Alto, CA • Engineering • Hybrid • Full-time

Founded by a team of ex-Apple engineers, Instrumental provides a collection of software technologies that enable brands and manufacturers to optimize their manufacturing process to cut waste by orders of magnitude compared to conventional solutions. This waste is non-trivial: 20 cents on every dollar spent in manufacturing is wasted in mistakes, experiments, downtime, yield loss, scrap, and returns. Our customers are top electronics companies, including Meta, NVIDIA, F5, and others.

The Instrumental platform collects, intelligently transforms, and contextually presents manufacturing data to technical end-users, enabling them to optimize their manufacturing process in real time. Our core technology is proprietary ML algorithms, packaged in an accessible, user-centric interface. We believe we must have both the best technology and the best access to that technology to win. We are experiencing incredible growth, especially in the AI infrastructure market segment.

Requirements
  • 5 or more years of DevOps or SRE experience deploying and operating commercial SaaS platforms on public cloud infrastructure, AWS preferred.
  • Expert knowledge of Linux, shell, containerization, Kubernetes, IaC (Terraform preferred), monitoring, logging, and APM tools.
  • Proven ability to take initiative and drive impactful projects to completion efficiently and independently.
  • Comfort with the ambiguity, pace, and frequent pivots inherent in a startup environment, with a track record of creating clarity for teams.
  • Experience introducing and integrating AI tools/processes into development and operation workflows.
  • Demonstrated skill in setting, iterating on, and measuring KPIs to ensure ongoing performance, reliability, and efficiency.
  • Network/application security and compliance experience is a plus.

Who You Are
  • PSR-obsessed owner who takes responsibility for production reliability, performance, and customer impact, making pragmatic tradeoffs under pressure.
  • Strong systems & infrastructure engineer with an everything-as-code, automation-first mindset.
  • Thrives in ambiguity and high-growth environments; trusted, decisive, and calm when it matters most.
  • Trusted, dependable partner known for sound judgment, clear communication, and calm execution, especially when things go sideways.

This position requires access to items and data that are developed under U.S. government contracts and subject to dissemination controls that limit access to U.S. citizens only.

We’re a growing team that works collaboratively, is supportive of each other, and is highly energized by the opportunity for a large impact. We actively work to promote an inclusive environment, valuing passion and the ability to learn. You’re encouraged to apply even if your experience doesn’t precisely match the job description!

The following is a representative annual base salary range for this position within the Bay Area: $181k - $197k. Job level and salary opportunities are evaluated through our interview process: we review the experience, knowledge, skills, and abilities of each applicant.

Instrumental is proud to offer a highly-rated variety of benefits, including health, vision, dental, commuter plans, and parental leave.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the Senior Site Reliability Engineer (SRE) in Palo Alto, CA vacancy
  • $158k - $225k

     ...Senior Site Reliability Engineer (SRE) Manufacturing advanced electronics requires understanding millions of signals generated across complex assembly processes. Instrumental builds systems that capture and analyze those signals — images, test results, and process data... 
    Senior

    Instrumental Inc

    Palo Alto, CA
    4 days ago
  • $170k - $230k

     ...Site Reliability Engineer (SRE) Palo Alto / San Francisco Bay Area About Mithril Mithril is an AI infrastructure platform built to make GPU compute more accessible and affordable for the world's leading enterprises, AI startups, and the AI research community,... 
    Suggested
    Work at office
    Local area
    1 day per week

    Mithril

    Palo Alto, CA
    5 days ago
  •  ...The Role We're looking for a Senior Site Reliability Engineer to own the reliability, scalability, and operational excellence of the production systems...  ...of a fast-growing customer base, and we need a seasoned SRE to help us scale these systems safely and keep them... 
    Senior

    XRC Ventures

    Palo Alto, CA
    2 days ago
  • $137.77k - $194.59k

     ...of roughly 80 scientists and engineers building and operating Rubin'...  ...role: You will own the reliability and robustness of Rubin Observatory...  ...Experience working in an SRE, DevOps, or data-intensive...  ...position, SLAC is open to on-site, hybrid, and remote work options... 
    Senior
    Remote work
    Flexible hours
    Night shift

    Stanford University

    Menlo Park, CA
    5 days ago
  •  ...Site Reliability Engineer There are NO limits to your career: come shape the future and be part of a truly unique global culture at OutSystems...  ...Hybrid Onsite in Menlo Park, CA Site Reliability Engineering (SRE) is a discipline that incorporates aspects of software... 
    Senior
    Immediate start
    Remote work
    Worldwide

    OutSystems

    Menlo Park, CA
    5 days ago
  • $150k - $180k

    A technology-focused data center developer in Mountain View, CA is looking for a Senior Site Reliability Engineer to manage software infrastructure. This full-time position requires experience in Software Engineering or DevOps, with strong proficiency in Golang. The role... 
    Senior
    Full time

    Verrus, LLC

    Mountain View, CA
    5 days ago
  • $140k - $220k

    About the Job You’ll own reliability and operational excellence for Pylon’s production systems...  ...’ll build tooling that makes the entire engineering team more effective, establish on‑call...  ...not a pure ops role. At Pylon, we believe SRE work should be a maximum of 50 %... 
    Senior

    Pylon

    Palo Alto, CA
    5 days ago
  • A global technology leader is looking for an experienced SRE software engineer in Cupertino, California, to build and enhance compute infrastructure...  ...Applicants should have at least 8 years of experience in site reliability engineering, a strong background in cloud infrastructure,... 
    Senior

    Apple Inc.

    Cupertino, CA
    2 days ago
  • A leading technology company is looking for a Java SRE Engineer to support large-scale cloud migrations and production systems on AWS and...  ...team members and collaborating with various teams to ensure reliability. This position is onsite in the San Francisco Bay Area. #J-18... 
    Senior

    EITACIES Inc.

    Santa Clara, CA
    3 days ago
  • A leading tech recruiting firm is seeking a Site Reliability Engineer to manage and optimize cloud infrastructure primarily using GCP or AWS. The...  ...Terraform. Ideal candidates will have over 4 years of experience in SRE or DevOps and a strong understanding of security best... 
    Senior

    Amiri Recruiting

    Mountain View, CA
    3 days ago
  • donato technologies is seeking a Senior SRE / DevOps Engineer in Sunnyvale, CA. The successful candidate will focus on ensuring system reliability and scalability while automating operations across all teams. Candidates should have over 8 years of experience in DevOps,... 
    Senior

    donato technologies

    Sunnyvale, CA
    2 days ago
  •  ...Site Reliability Engineer (SRE) Location: Santa Clara Valley (Cupertino), California, Hybrid. Duration: 6+ Months Job Description Deploy, support and monitor new and existing services, platforms, and application stacks. Use scale testing to measure, tune... 

    Zortech Solutions

    Cupertino, CA
    1 day ago
  •  ...Title: Site Reliability Engineer (SRE) Location: Sunnyvale, CA (3x/week onsite) Contract Responsibilities: Engage with our product teams to understand requirements, design and implement resilient and scalable infrastructure... 
    Contract work

    AceStack LLC

    Sunnyvale, CA
    3 days ago
  •  ...Senior Site Reliability Engineer LeanData helps the world's fastest-growing companies automate, simplify, and accelerate revenue. We are looking...  ...Experienced Architect: 5+ years of experience in SRE, DevOps, or Systems Engineering, with a proven track record... 
    Senior
    Full time
    Work at office
    Flexible hours
    2 days per week

    LeanData

    Santa Clara, CA
    5 days ago
  • $150k - $175k

     ...Site Reliability Engineer At ASAPP, our mission is simple: deliver the best AI-powered customer experience—faster than anyone else. To achieve that, we're guided by principles that shape how we think, build, and execute. We value customer obsession, purposeful speed... 
    Senior
    Remote work

    ASAPP

    Mountain View, CA
    1 day ago
  • $148k - $235.75k

     ...world. NVIDIA is looking for a seasoned SRE to join its complex and fast-paced...  ...organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-...  ...-prem infrastructure. Maintain uptime, reliability and readiness of on-prem engineering... 
    Senior
    Remote work

    NVIDIA

    Santa Clara, CA
    1 day ago
  •  ...Senior Site Reliability Engineer Latitude AI develops automated driving technologies, including L3, for Ford vehicles at scale. We're driven by the opportunity to reimagine what it's like to drive and make travel safer, less stressful, and more enjoyable for everyone... 
    Senior
    Work at office
    Immediate start

    Latitude AI

    Palo Alto, CA
    5 days ago
  •  ...join our small team focused on growth and productivity. The role involves scaling our platform and infrastructure while enhancing reliability and the overall developer experience. Ideal candidates will have strong expertise in distributed systems, cloud-native... 
    Senior
    Remote job

    BuildBuddy

    Palo Alto, CA
    3 days ago
  • $207k - $300k

    Google Inc. is looking for a Staff Software Engineer specializing in Site Reliability Engineering in Sunnyvale, CA. This role combines software and systems engineering to build and manage distributed systems, ensuring high reliability and uptime. The ideal candidate should... 
    Senior

    Google Inc.

    Sunnyvale, CA
    5 days ago
  • $180k - $260k

     ...effortless integration into customers' logistics operations. About the role We are seeking an experienced Senior/Staff Site Reliability Engineer to support the operation, monitoring, and scaling of our growing fleet of autonomous vehicles. In this role, you will... 
    Senior
    Odd job
    Work at office
    Remote work

    Gatik AI

    Mountain View, CA
    5 days ago
  • $169k - $224k

     ...organization of scientists, engineers, and physicians and we are using...  ...GRAIL is seeking a Staff Site Reliability / DevOps Engineer to lead the...  ...~10+ years of experience in SRE, DevOps, or infrastructure engineering...  ...with cross-functional and senior stakeholders Fast-paced,... 
    Full time
    Work at office
    Local area
    Flexible hours
    Shift work

    GRAIL

    Menlo Park, CA
    6 days ago
  • $126k - $204.5k

     ..., you will collaborate closely with our engineering teams to develop innovative solutions that...  ...of the product and ensure the reliability and availability of our services. Qualifications...  ...~5+ years of experience as a DevOps/SRE engineer with a passion for technology and... 
    Senior
    Full time
    Work at office

    Palo Alto Networks

    Santa Clara, CA
    4 days ago
  •  ...that turns raw, high‑volume telemetry into reliable, job‑centric insights and automation for GPU fleets. Join our team of innovative engineers who are building this platform and...  ...operating production distributed systems as SRE/DevOps/Platform Ops. Proven ownership of... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  • $174k - $252k

    A leading tech company is seeking a Senior Software Engineer for Site Reliability Engineering based in Sunnyvale, CA. The role involves ensuring service reliability, leading technical projects, and enhancing systems performance. Candidates should have at least 5 years of... 
    Senior

    Google Inc.

    Sunnyvale, CA
    3 days ago
  • $174k - $252k

    Senior Software Engineer, Site Reliability Engineering X Applicants in San Francisco: Qualified applications with arrest or conviction records will be considered...  ..., and a good one. Site Reliability Engineering (SRE) is an engineering discipline that combines software... 
    Senior
    Full time

    Google Inc.

    Sunnyvale, CA
    3 days ago
  • $152k - $241.5k

    Overview NVIDIA is looking for a Senior Site Reliability Engineer (SRE) to join our Compute Farm team and help build the next generation of our global services platform. The role focuses on keeping critical systems operational while leveraging AI technologies to deliver... 
    Senior

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  • $152k - $241.5k

     ...intelligence. Job Overview We’re looking for a Senior SRE to join our Compute Farm team and help...  ...host lifecycle management, fleet reliability/auto‑healing, E2E observability or data...  ..., Go, Perl, or Ruby. Mentored other engineers and influenced technical direction through... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • A leading technology firm is in search of a Senior Wireless Network Site Reliability Engineer to manage and enhance their wireless network infrastructure. The ideal candidate has over 8 years of experience in wireless network operations and a strong background in wireless... 
    Senior

    TechDigital Group

    Santa Clara, CA
    4 days ago
  •  ...Infrastructure Footprint: Global production infrastructure across AWS, South America, and Europe Role Overview Seeking a Senior Site Reliability Engineer / DevOps Engineer to design, scale, and operate highly available global infrastructure supporting production systems... 
    Senior

    Prophet Town

    Mountain View, CA
    5 days ago
  • Title: Sr. SRE / DevOps Engineer Location: Sunnyvale, CA Job Summary - For this role, we are looking for a Sr. SRE / DevOps Engineer at Sunnyvale, California location. As Site Reliability Engineer, the individual will work closely with multi‑functional teams, automate... 
    Senior
    Local area

    donato technologies

    Sunnyvale, CA
    2 days ago
