Site Reliability Engineer - Scale & Observability

gamma.app

A dynamic tech firm located in San Francisco is seeking a Site Reliability Engineer to enhance operational health across their production systems. This high-impact role demands expertise in AWS and strong programming skills. You will manage production systems' reliability and lead incident response efforts to prevent issues, all while contributing to the scalability and efficiency of their services. Ideal candidates will have 5+ years of relevant experience and a passion for leveraging technology to drive outcomes.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Site Reliability Engineer - Scale & Observability in San Francisco, CA vacancy
  • $175k - $250k

     ...did my part and supported the Regular Toilet is seeking a Site Reliability Engineer to enhance the reliability and performance of our systems at...  ...environment. Join us to help ensure our platform runs reliably at scale.
    Suggested
    Remote job
    Flexible hours

    I did my part and supported the Regular Toilet

    San Francisco, CA
    1 day ago
  •  ...is currently Tuesday. Engineering at Lambda is responsible for building and scaling our cloud offering. Our...  ...Do Deploy and operate observability platforms for logging,...  ...and improve product reliability. Lead members of other...  ...years of experience in Site Reliability Engineering... 
    Suggested
    Work at office
    Local area
    Work from home

    Lambda

    San Francisco, CA
    15 hours ago
  • A leading AI research company based in San Francisco is seeking experienced reliability engineers to scale their infrastructure and ensure system performance and reliability. This role involves collaborating with diverse teams to develop resilient systems and enhance operations... 
    Suggested

    OpenAI

    San Francisco, CA
    1 day ago
  • Fieldguide is seeking a Senior Site Reliability Engineer to ensure the reliability and scalability of our production systems in San Francisco...  ...teams to define reliability standards and build robust observability practices. Candidates should have at least 5 years of experience... 
    Suggested
    Remote job
    Flexible hours

    Fieldguide

    San Francisco, CA
    4 days ago
  • $147k - $202k

     ...Overview: We are seeking a highly technical Staff Observability Site Reliability Engineer with a specialty in Splunk to own and evolve our Splunk...  ...: Eliminate "toil" by automating the deployment and scaling of observability agents and collectors. Required Skills... 
    Suggested
    Permanent employment
    Work at office
    Local area
    Worldwide
    Flexible hours

    Okta

    San Francisco, CA
    a month ago
  • $230k - $310k

    A tech company is seeking an experienced Site Reliability Engineer to ensure the reliability and performance of its production systems across AWS infrastructure. You will build observability tools, lead incident responses, and collaborate on architectural improvements.... 

    Gamma

    San Francisco, CA
    2 days ago
  • $177.19k - $364.8k

    Pinterest is seeking a Staff Software Engineer to join the Observability team. This role involves designing and building observability solutions while collaborating with various teams. Ideal candidates will have over 7 years of experience in distributed systems, a Bachelor... 
    Work at office

    jobr.pro

    San Francisco, CA
    1 day ago
  •  ...in San Francisco seeks infrastructure engineers to enhance the tooling and systems...  ...include building GPU orchestration, scaling cloud batch job systems, and designing...  ...infrastructure and a strong focus on reliability and observability. This position is in-person, and international...
    Visa sponsorship

    Exa

    San Francisco, CA
    3 days ago
  • A leading AI research company in San Francisco is seeking a Software Engineer to enhance infrastructure supporting cutting-edge AI systems. The role involves designing reliable systems and optimizing performance for millions of users. Ideal candidates possess experience... 

    OpenAI

    San Francisco, CA
    2 days ago
  • $175k - $250k

     ...base of SaaS companies. About the Site Reliability Engineering Team The Site Reliability Engineering...  ...remains fast, reliable, and resilient at scale. We build the systems and practices...  ...modes Care deeply about uptime, observability, and performance, placing... 
    Remote work

    I did my part and supported the Regular Toilet

    San Francisco, CA
    1 day ago
  • $60 per hour

    Senior Site Reliability Engineer (Copy) Seattle Hybrid (Hybrid location). Full-time. About Us Supio...  ...bringing total funding to $91 M. We’re scaling rapidly and looking for exceptional...  ...coordination. Build safe, repeatable, and observable workflows. GitHub Operations: Manage... 
    Full time
    Work at office
    Flexible hours

    Bonfirevc

    San Francisco, CA
    1 day ago
  • Senior Site Reliability Engineer. Hybrid - San Francisco. Our Mission &...  ...operates as both a central engineering function and an embedded reliability...  ...native stack to help Drata scale reliably for a rapidly...  ...artifacts - SLO templates, observability checklists, alerting...
    Work at office
    Immediate start
    Worldwide
    Monday to Friday
    Flexible hours

    Careers at Drata

    San Francisco, CA
    2 days ago
  • $166.9k - $225.9k

     ...operates as both a central engineering function and an embedded reliability practice. You'll be part...  ...stack to help Drata scale reliably for a rapidly growing...  ...—SLO templates, observability checklists, alerting standards...  ...years of experience in Site Reliability Engineering,... 
    Flexible hours

    Drata

    San Francisco, CA
    2 days ago
  •  ...role We're looking for a world-class Site Reliability Engineer to ensure the reliability, performance...  ...systems that power agentic AI at scale. Your mission: keep our ultra-low-latency...  ...our reliability posture end-to-end—observability, performance tuning, incident ops, infrastructure... 

    Blaxel

    San Francisco, CA
    2 days ago
  •  ...Fieldguide. About the Role As a Senior Site Reliability Engineer (SRE) at Fieldguide, you will be...  ...ensuring the reliability, scalability, and observability of our production systems. You will...  ..., highly available, and capable of scaling with rapid growth. You’ll work closely... 
    Remote work
    Work from home
    Flexible hours

    Fieldguide

    San Francisco, CA
    4 days ago
  •  ...the economics of data integration at scale. And now Airbyte is building the frontier...  ...: You'll be the infrastructure and reliability engineer on the Data Replication team - a full-...  ...infrastructure. Maintain and enhance observability, alerting, and anomaly detection with... 
    Local area

    Airbyte

    San Francisco, CA
    4 days ago
  •  ...users while enabling our engineering teams to ship fast....  ...tooling that improves reliability and partnering with engineering...  ...systems that are observable, resilient, and easy...  ...help shape how Gamma scales to serve its next 100...  ...ll bring 5+ years in Site Reliability Engineering... 
    Work at office
    Work from home

    gamma.app

    San Francisco, CA
    4 days ago
  •  ...’re hiring an SRE to join our engineering team at Plenful and take ownership of the reliability and performance of the systems...  ...will influence how we build, scale and operate our platform as we...  ...What you’ll do Reliability, Observability and Performance: Maintain and... 
    Work at office
    Remote work
    Flexible hours
    2 days per week

    Plenful

    San Francisco, CA
    2 days ago
  •  ...Connor was a machine learning research engineer at Scale AI. The rest of our team comes from...  ...Senior SRE, you'll tackle the scaling and reliability challenges that come with adding...  ...services, and building the automation and observability that keep Unify fast and reliable at... 

    Unify

    San Francisco, CA
    1 day ago
  •  ...computing. About the Role We're seeking a Site Reliability Engineer to ensure Hyperbolic's GPU...  ...affordable, accessible AI compute at scale. Who You Are Expert in site reliability...  ...automated rollback mechanisms Proficient in observability tools and practices including metrics... 

    Hyperbolic Labs

    San Francisco, CA
    3 days ago
  • $151.5k - $252.5k

     ...enable the acceleration of safe AI at scale. As the market leader in both data...  ...are looking for an experienced Senior Site Reliability Engineer to join the Veeam Data Cloud (VDC) engineering...  ..., the Serverless Framework, etc.) Observability (Azure Monitor, AppInsights, Elastic... 
    Base plus commission
    Local area
    Worldwide

    Veeam

    San Francisco, CA
    2 days ago
  •  ...significantly outperforms individual engineers. We combine language models...  ...are seeking an experienced Site Reliability Engineer to join our...  ...to deploy, monitor, and scale our services reliably. As...  ...monitoring, alerting, and observability solutions using Datadog and... 

    CodeRabbit

    San Francisco, CA
    1 day ago
  • What you’ll do As a Senior Site Reliability Engineer, you’ll work closely with product teams in Spend...  ...readiness. Lead incident response, observability, and automation across critical systems...  ...Able to lead SRE strategy for large‑scale, cross‑functional projects. Strong... 

    Airwallex

    San Francisco, CA
    15 hours ago
  •  ...The Team: Platform Engineering is the department within SRE that is responsible...  ...internal service mesh), and observability and alerting systems. The...  ...that ensure cluster reliability and security (e.g., CoreDNS,...  ...Gatekeeper). As our infrastructure scales to support new use cases and...
    Work at office
    Local area
    Remote work
    Worldwide
    Flexible hours

    MongoDB

    San Francisco, CA
    4 days ago
  • $140k - $205k

     ...Senior Technology Site Reliability Engineer Cooley is seeking a Senior Site Reliability Engineer...  ...and maintain automated, resilient, and observable systems that support high...  ...using Terraform Automate deployment, scaling, and recovery processes to reduce manual... 
    Full time
    Temporary work
    Work at office
    Flexible hours
    Weekend work

    Cooley

    San Francisco, CA
    4 days ago
  •  ...onboard services and teams to the reliability tenets. Establish and...  ...development teams to build resilient, observable, fault‑tolerant, recoverable...  .... 6+ years of experience in Site Reliability Engineering, managing infrastructure and services at scale. History of end‑to‑end... 

    OutSystems, Inc.

    San Francisco, CA
    1 day ago
  • $210.6k - $305.1k

     ...helping customers deploy at scale while also delivering AI-powered...  ...Security, Collaboration, and Observability portfolios Your Impact...  ...led a distributed team of 5+ engineers, can demonstrate strong technical...  ...Please see the Cisco careers site to discover more benefits and... 
    Full time
    Temporary work
    Local area
    Flexible hours

    Cisco

    San Francisco, CA
    5 days ago
  • $150k

     ...Role We are seeking an experienced Site Reliability Engineer (SRE) with a strong focus on...  ...implement alerting and dashboards using observability tooling (e.g., CloudWatch, Datadog, Grafana...  ...and vulnerability remediation at scale, including OS-level patching (Amazon... 

    VantageScore

    San Francisco, CA
    19 days ago
  • $227.2k - $324.5k

     ...About the Role: Site Reliability Engineering (SRE) at Tubi is not a traditional operations team...  ...challenges of building and running large-scale, distributed systems. Our mission is...  ...strategy and vision for Tubi's observability, and automation platforms. Partner with... 
    Full time
    Contract work
    Temporary work
    Local area
    Flexible hours

    Tubi

    San Francisco, CA
    4 days ago
  •  ...growing, early-stage startup to identify a top-tier Site Reliability Engineer who will play a critical role in scaling and strengthening a high-performance platform...  ...Deep understanding of system performance, observability, and debugging techniques Experience identifying... 

    Velia multiservices

    San Francisco, CA
    15 days ago
