Staff Site Reliability Engineer - Observability GCP

$194k - $267k
Full-time

Okta

Secure Every Identity, from AI to Human

Identity is the key to unlocking the potential of AI. Okta secures AI by building the trusted, neutral infrastructure that enables organizations to safely embrace this new era. This work requires a relentless drive to solve complex challenges with real-world stakes. We are looking for builders and owners who operate with speed and urgency and execute with excellence. This is an opportunity to do career-defining work. We're all in on this mission. If you are too, let's talk.

We are seeking a highly technical Observability Site Reliability Engineer with a specialty in Google Cloud to own and expand our Observability ecosystem into GCP. In this role, you will move beyond simple monitoring to delivering a world-class, comprehensive, scalable Observability Platform that enables our SRE teams and business partners. You will treat infrastructure as code, using Terraform and strong coding proficiency in Go, Python, or Ruby, to automate the deployment of agents and collectors across complex distributed systems.

Key Responsibilities
  • Automated Infrastructure: Design, build, and maintain scalable observability infrastructure using tools like Terraform.
  • GCP Observability Engineering: Optimize the collection, processing, and storage of Observability data to ensure high reliability and low latency of our Splunk and Grafana services.
  • Incident Response: Participate in on-call rotations and lead post-incident reviews to drive systemic improvements and "observability-driven development."
  • Automation: Eliminate "toil" by automating the deployment and scaling of observability agents and collectors.

Required Skills & Experience (The Essentials)
  • GKE: Minimum 5+ years of experience scaling and managing observability on Google Cloud Platform.
  • Visualization: Expertise in creating intuitive, actionable Splunk or Grafana dashboards that correlate data across multiple sources.
  • SRE Mindset: Minimum 3+ years of experience in an SRE, DevOps, or Systems Engineering role with a focus on high-availability systems.
  • Programming Proficiency: Strong coding skills in Python or Go for building internal tools and automating workflows.
  • Distributed Systems: Deep understanding of Linux internals, networking (TCP/IP, DNS, Load Balancing), and container orchestration (Kubernetes/GKE).
  • Problem Solving: A data-driven approach to debugging complex, cross-service performance bottlenecks.

Bonus Skills (The "Nice-to-Haves")
  • Telemetry Standards: Hands-on experience with OpenTelemetry (OTel), Vector, or similar frameworks for instrumenting applications.
  • Grafana Loki: Experience in migrating from Splunk to Grafana Loki.
  • Other Cloud Platforms: Experience managing native observability tools within AWS.

Additional requirements: This position requires the ability to access federal environments and/or have access to protected federal data. As a condition of employment for this position, the successful candidate must be able to submit documentation establishing U.S. Person status (e.g., a U.S. Citizen, National, Lawful Permanent Resident, Refugee, or Asylee; 22 CFR 120.15) upon hire.

#LI-MM

#LI-Hybrid

P24517_3387022

Below is the annual base salary range for candidates located in the San Francisco Bay Area. Your actual base salary will depend on factors such as your skills, qualifications, experience, and work location. In addition, Okta offers equity (where applicable), bonus, and benefits, including health, dental and vision insurance, 401(k), flexible spending account, and paid leave (including PTO and parental leave) in accordance with our applicable plans and policies. To learn more about our Total Rewards program please visit:

The annual base salary range for this position for candidates located in the San Francisco Bay Area is between:

$194,000—$267,000 USD

The Okta Experience
  • Supporting Your Well-Being
  • Driving Social Impact
  • Developing Talent and Fostering Connection + Community

We are intentional about connection. Our global community, spanning over 20 offices worldwide, is united by a drive to innovate. Your journey begins with an immersive, in-person onboarding experience designed to accelerate your impact and connect you to our mission and team from day one.

Okta is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, marital status, age, physical or mental disability, or status as a protected veteran. We also consider for employment qualified applicants with arrest and conviction records, consistent with applicable laws.

If reasonable accommodation is needed to complete any part of the job application, interview process, or onboarding please use this Form to request an accommodation.

Notice for New York City Applicants & Employees: Okta may use Automated Employment Decision Tools (AEDT), as defined by New York City Local Law 144, that use artificial intelligence, machine learning, or other automated processes to assist in our recruitment and hiring process. In accordance with NYC Local Law 144, if you are an applicant or employee residing in New York City, please click here to view our full NYC AEDT Notice.

Vacancy posted 1 day ago