Manager, Site Reliability Engineering

$204k - $281k
Full-time

Okta

Secure Every Identity, from AI to Human

Identity is the key to unlocking the potential of AI. Okta secures AI by building the trusted, neutral infrastructure that enables organizations to safely embrace this new era. This work requires a relentless drive to solve complex challenges with real-world stakes. We are looking for builders and owners who operate with speed and urgency and execute with excellence. This is an opportunity to do career-defining work. We're all in on this mission. If you are too, let's talk.

San Francisco, California

This position requires 2 days a week in our San Francisco office.

The IDaaS Site Reliability Engineering Group

Okta authenticates, authorizes, and provisions millions of users a day. The service is hosted on Amazon Web Services (AWS) across multiple availability zones and geographically separated regions. The service is designed for high throughput and 99.999% availability. We're looking for a technical leader to help us continue to scale the service with great people and reliable, cost-effective, and efficient infrastructure, processes, and tooling. As the Manager of Infrastructure Platform and Shared Services, you will oversee multiple teams focused on Edge networking, K8s platform, CI/CD, Observability, and automation platform & tooling.

What you’ll be doing

  • Manage a team of SREs supporting the various workloads and teams behind our IDaaS platform.
  • Drive the microservice journey, DevOps maturity, and workload reliability in tandem with architects and teams across the organization.
  • Accelerate the velocity of SRE and product engineering by developing powerful tooling, intuitive self-service capabilities, and robust self-healing patterns.
  • Lead, mentor, and grow a high-performing team of engineers and managers across platform, infrastructure, and shared services domains.
  • Perform engineering design evaluations and ensure the completion of projects within resource, budget, and scheduling constraints.
  • Improve SDLC processes for cloud infrastructure as code, including the maturity of CI/CD pipelines and change and release management.
  • Manage service and business expectations and prioritize resource allocation.
  • Maintain a deep knowledge of industry best practices, evolving trends, and technologies.

What you’ll bring to the role

  • 3+ years of experience in technical leadership & people management
  • Extensive experience using Agile and DevOps methodologies to build product infrastructure and shared services at scale
  • Experience running large-scale infrastructure platforms supporting a SaaS/Cloud service in a public cloud, preferably AWS. Experience supporting a multi-cloud environment is a plus.
  • Strong expertise in cloud-native architectures, containerization (Kubernetes), IaC (Terraform), and CI/CD pipelines
  • Strong background and hands-on experience in software development, PaaS, and automation
  • Deep experience building and operating observability platforms and monitoring tools (Grafana, Splunk, APM, etc.) in a large-scale environment
  • Effective verbal and written communication and interpersonal skills
  • Computer Science degree, related degree, or equivalent experience

Additional requirements

This position requires the ability to access federal environments and/or have access to protected federal data. As a condition of employment for this position, the successful candidate must be able to submit documentation establishing U.S. Person status (e.g., a U.S. Citizen, National, Lawful Permanent Resident, Refugee, or Asylee; 22 CFR 120.15) upon hire.

(P21661_3436238)

Below is the annual base salary range for candidates located in the San Francisco Bay Area. Your actual base salary will depend on factors such as your skills, qualifications, experience, and work location. In addition, Okta offers equity (where applicable), bonus, and benefits, including health, dental, and vision insurance, 401(k), flexible spending account, and paid leave (including PTO and parental leave) in accordance with our applicable plans and policies. To learn more about our Total Rewards program please visit:

The annual base salary range for this position for candidates located in the San Francisco Bay Area is between:

$204,000—$281,000 USD

The Okta Experience

  • Supporting Your Well-Being
  • Driving Social Impact
  • Developing Talent and Fostering Connection + Community

We are intentional about connection. Our global community, spanning over 20 offices worldwide, is united by a drive to innovate. Your journey begins with an immersive, in-person onboarding experience designed to accelerate your impact and connect you to our mission and team from day one.

Okta is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, marital status, age, physical or mental disability, or status as a protected veteran. We also consider for employment qualified applicants with arrest and conviction records, consistent with applicable laws.

If reasonable accommodation is needed to complete any part of the job application, interview process, or onboarding, please use this Form to request an accommodation.

Notice for New York City Applicants & Employees: Okta may use Automated Employment Decision Tools (AEDT), as defined by New York City Local Law 144, that use artificial intelligence, machine learning, or other automated processes to assist in our recruitment and hiring process. In accordance with NYC Local Law 144, if you are an applicant or employee residing in New York City, please click here to view our full NYC AEDT Notice.

Okta is committed to complying with applicable data privacy and security laws and regulations. For more information, please see our Personnel and Job Candidate Privacy Notice at

Vacancy posted 10 hours ago
Similar jobs that could be interesting for you
Based on the Manager, Site Reliability Engineering in San Francisco, CA vacancy
  •  ...Job Description Forhyre is looking for engineers who can bring unique perspectives and...  ...practices while building a culture of reliability and observability Engage in and improve...  ...Participate in critical incident management and timely post-mortems of production incidents... 
    Suggested

    Forhyre

    San Francisco, CA
    5 days ago
  • $150k

     ...Description About The Role We are seeking an experienced Site Reliability Engineer (SRE) with a strong focus on DevSecOps to join our growing...  ..., APIs, and software supply chain. You will drive patch management programs, harden our Cloud infrastructure, and maintain... 
    Suggested

    VantageScore

    San Francisco, CA
    10 days ago
  • $163k - $203k

     ...contributor on the SRE team, responsible for the reliability, scalability, and security of Prosper’s...  .... This is as much of a platform engineering role as it is SRE role — you will...  ...reliability within Kubernetes-based compute (managed by the Infrastructure Engineering team)... 
    Suggested
    Work experience placement
    Work at office
    Local area
    Remote work
    Flexible hours
    2 days per week

    Prosper

    San Francisco, CA
    1 day ago
  •  ...lasting impact. About the Role SDF is looking for a Senior Site Reliability Engineer to help build and operate the foundation that powers our...  ...engineer. First-hand experience with configuration management and infrastructure as code (Ansible, Puppet, Terraform).... 
    Suggested

    TechChain Talent

    San Francisco, CA
    29 days ago
  • $238k - $290k

     ...getting started. Role Overview As a Staff Software Engineer on the Site Reliability team at Harvey, you will ensure the reliability, scalability...  .... What You'll Do Design, implement, and manage monitoring, alerting, and infrastructure resources (compute... 
    Suggested
    Relocation package

    Harvey

    San Francisco, CA
    2 days ago
  •  ...alongside clinicians to make that possible. We’re a team of doctors, engineers, designers, researchers, and creatives building tools that...  ...for leading incidents end-to-end. Improve operational reliability: Identify recurring issues and reliability risks, and drive fixes... 
    Work at office
    Worldwide

    Heidi Health Ltd

    San Francisco, CA
    2 days ago
  • $163k - $203k

     ...contributor on the SRE team, responsible for the reliability, scalability, and security of Prosper’s...  .... This is as much of a platform engineering role as it is SRE role — you will...  ...reliability within Kubernetes‑based compute (managed by the Infrastructure Engineering team)... 
    Work experience placement
    Work at office
    Local area
    Remote work
    Flexible hours
    2 days per week

    Prosper

    San Francisco, CA
    3 days ago
  • $125k - $165k

    Position: Site Reliability Engineer Location: San Francisco, CA Job Id: 434 # of Openings: 1 TELCOR Inc, a leading innovator in laboratory software...  ...across cloud and containerized environments, as well as manage production infrastructure and deployment workflows across... 
    Temporary work
    Work at office
    Visa sponsorship
    Work visa
    Relocation package
    Flexible hours

    TELCOR

    San Francisco, CA
    10 hours ago
  • The role We're looking for a world-class Site Reliability Engineer to ensure the reliability, performance, and scalability of our AI infrastructure...  ...Jenkins) Strong debugging, problem‑solving, and incident‑management skills Preferred Experience with infrastructure‑as‑code... 

    Blaxel

    San Francisco, CA
    3 days ago
  •  ...daily users while enabling our engineering teams to ship fast. You'll...  ...automation and tooling that improves reliability and partnering with...  ...reliability best practices Manage and optimize our infrastructure...  ...What you'll bring 5+ years in Site Reliability Engineering, DevOps... 
    Work at office
    Work from home

    gamma.app

    San Francisco, CA
    10 hours ago
  • US Corp. is seeking a Lead Site Reliability Engineer to spearhead our mission of delivering highly available and performant systems. With an average...  ...implementing automated infrastructure using Terraform, managing containerized workloads within Kubernetes, and refining... 

    Axiom Pursuits

    San Francisco, CA
    2 days ago
  •  ...poised to redefine computing. About the Role We're seeking a Site Reliability Engineer to ensure Hyperbolic's GPU marketplace and AI...  ...success rates, building robust incident response systems, managing capacity across our distributed GPU network, and implementing... 

    deCircle

    San Francisco, CA
    1 day ago
  • $166.9k - $225.9k

     ...SRE team operates as both a central engineering function and an embedded reliability practice. You'll be part of a...  ...infrastructure requests: ECS task management, secret rotations, Terraform changes...  ...bring 6+ years of experience in Site Reliability Engineering, Cloud Engineering... 
    Flexible hours

    Drata

    San Francisco, CA
    3 days ago
  • $140.3k - $191.55k

     ...time to write medical publications and regulatory paperwork. Site Reliability Engineer Location: Atlanta, GA; Miami, FL; Cambridge, MA; San...  ...solutions to support service delivery processes Build and manage CI/CD pipelines, automated testing, capacity planning, performance... 
    Temporary work
    Work experience placement

    Writemed

    San Francisco, CA
    2 days ago
  •  ...home day is currently Tuesday. Engineering at Lambda is responsible for...  ...for system deployment, management and maintenance. What You’ll...  ...adoptable and improve product reliability. Lead members of other engineering...  ...5+ years of experience in Site Reliability Engineering... 
    Work at office
    Local area
    Work from home

    Lambda

    San Francisco, CA
    1 day ago
  •  ...manifesto. About the Role We're looking for an Infrastructure Engineer to take the lead on scaling our operational resilience as we...  ...This is a high-impact, high-trust role where you’ll shape how reliability is done - reducing incident load, building internal tooling, and... 
    Worldwide
    Shift work

    Happyrobot Inc.

    San Francisco, CA
    2 days ago
  • $140k - $205k

    Senior Technology Site Reliability Engineer. Locations: San Francisco; New York; Santa...  ...to ensure high availability and performance. Implement and manage service-level indicators (SLIs), objectives (SLOs),... 
    Full time
    Temporary work
    Work at office
    Flexible hours
    Weekend work

    Cooley LLP

    San Francisco, CA
    2 days ago
  •  ...About the role We’re hiring an SRE to join our engineering team at Plenful and take ownership of the reliability and performance of the systems that power our product...  ...patching, audit readiness and vulnerability management. Participate in the on‑call rotation and respond... 
    Work at office
    Remote work
    Flexible hours
    2 days per week

    Plenful

    San Francisco, CA
    3 days ago
  • $125k - $165k

    Position Site Reliability Engineer Location Lincoln, NE, San Francisco, CA, or Remote Job ID 434 Openings 1 Job Summary The Site Reliability...  ...resilient systems across cloud and containerized environments, and manage production infrastructure and deployment workflows across... 
    Temporary work
    Remote work
    Visa sponsorship
    Work visa
    Flexible hours

    TELCOR Inc

    San Francisco, CA
    10 hours ago
  • $165k - $225k

     ...and changing Stellar ecosystem. SDF is looking for a Senior Site Reliability Engineer to help build and operate the foundation that powers our...  ...DevOps engineer. First-hand experience with configuration management and infrastructure as code (Ansible, Puppet, Terraform). Proficient... 
    Temporary work
    Work at office
    Local area
    Worldwide
    Flexible hours

    Stellar

    San Francisco, CA
    10 hours ago
  • Happyrobot Inc. is looking for an Infrastructure Engineer in San Francisco, California. This role involves leading the stability and observability of systems while debugging complex issues as they arise. Candidates should have over 3 years of experience with production... 

    Happyrobot Inc.

    San Francisco, CA
    2 days ago
  •  ...customer acquisition, and Connor was a machine learning research engineer at Scale AI. The rest of our team comes from companies like...  ...of-the-art AI. As a Senior SRE, you'll tackle the scaling and reliability challenges that come with adding terabytes of data monthly and... 

    Unify

    San Francisco, CA
    3 days ago
  •  ...that significantly outperforms individual engineers. We combine language models with human...  ...The Role We are seeking an experienced Site Reliability Engineer to join our Platform...  ...Engineering roles Proven track record of managing production systems at scale, preferably... 

    CodeRabbit

    San Francisco, CA
    2 days ago
  • A dynamic tech firm located in San Francisco is seeking a Site Reliability Engineer to enhance operational health across their production systems...  ...expertise in AWS and strong programming skills. You will manage production systems' reliability and lead incident response... 

    gamma.app

    San Francisco, CA
    10 hours ago
  • TELCOR Inc is looking for a Site Reliability Engineer to ensure the reliability, scalability, and performance of our AI products' systems. The...  ...resilient systems in cloud and containerized environments while managing production infrastructure and deployment workflows. The... 
    Remote job

    TELCOR Inc

    San Francisco, CA
    10 hours ago
  •  ...specialist technology provider delivering advanced provisioning, management, and security solutions for data centers. The organization...  ...Skills/Qualifications BS/MS degree in Computer Science, Engineering, or a related subject. Equivalent experience accepted. Proven... 
    Work experience placement
    Start working today
    Remote work
    Flexible hours

    Hamilton Barnes Associates Limited

    San Francisco, CA
    10 hours ago
  • Senior Site Reliability Engineer. Hybrid - San Francisco. Our Mission & Values: At Drata, we...  ...s SRE team operates as both a central engineering function and an embedded reliability...  ...Handle infrastructure requests: ECS task management, secret rotations, Terraform changes,... 
    Work at office
    Immediate start
    Worldwide
    Monday to Friday
    Flexible hours

    Careers at Drata

    San Francisco, CA
    3 days ago
  • Senior Site Reliability Engineer. Locations: US - San Francisco...  ...6+ years of experience in Site Reliability Engineering, managing infrastructure and services at scale. History of end-to-end... 
    Immediate start
    Remote work
    Worldwide

    OutSystems Inc.

    San Francisco, CA
    2 days ago
  • What you’ll do As a Senior Site Reliability Engineer, you’ll work closely with product teams in Spend to deliver and maintain scalable, reliable cloud infrastructure in support of key product initiatives. Aligned to the roadmap, you’ll lead on infrastructure design and... 

    Airwallex

    San Francisco, CA
    1 day ago
  • $250.5k - $335.9k

    P5/P6: SRE Lead, Content Distribution Engineering Media Engineering. SF CA / LA CA / NYC Team Intro On any given day at Disney Entertainment...  ...of 12 years of engineering leadership experience, including managing and influencing teams directly and indirectly Bachelors or... 
    Worldwide

    The Walt Disney Company (Germany) GmbH

    San Francisco, CA
    4 days ago
