Staff Site Reliability Engineer - Observability GCP
$194k - $267k
Okta
Secure Every Identity, from AI to Human

Identity is the key to unlocking the potential of AI. Okta secures AI by building the trusted, neutral infrastructure that enables organizations to safely embrace this new era. This work requires a relentless drive to solve complex challenges with real-world stakes. We are looking for builders and owners who operate with speed and urgency and execute with excellence. This is an opportunity to do career-defining work. We're all in on this mission. If you are too, let's talk.

We are seeking a highly technical Observability Site Reliability Engineer with a specialty in Google Cloud to own and expand our Observability ecosystem into GCP. In this role, you will move beyond simple monitoring to delivering a world-class, comprehensive, scalable Observability Platform that enables our SRE teams and business partners. You will treat infrastructure as code, using Terraform and strong coding proficiency in Go, Python, or Ruby, to automate the deployment of agents and collectors across complex distributed systems.

Key Responsibilities
- Automated Infrastructure: Design, build, and maintain scalable observability infrastructure using tools like Terraform.
- GCP Observability Engineering: Optimize the collection, processing, and storage of observability data to ensure high reliability and low latency of our Splunk and Grafana services.
- Incident Response: Participate in on-call rotations and lead post-incident reviews to drive systemic improvements and "observability-driven development."
- Automation: Eliminate "toil" by automating the deployment and scaling of observability agents and collectors.

Required Skills & Experience (The Essentials)
- GKE: Minimum 5+ years of experience scaling and managing observability on Google Cloud Platform.
- Visualization: Expertise in creating intuitive, actionable Splunk or Grafana dashboards that correlate data across multiple sources.
- SRE Mindset: Minimum 3+ years of experience in an SRE, DevOps, or Systems Engineering role with a focus on high-availability systems.
- Programming Proficiency: Strong coding skills in Python or Go for building internal tools and automating workflows.
- Distributed Systems: Deep understanding of Linux internals, networking (TCP/IP, DNS, Load Balancing), and container orchestration (Kubernetes/GKE).
- Problem Solving: A data-driven approach to debugging complex, cross-service performance bottlenecks.

Bonus Skills (The "Nice-to-Haves")
- Telemetry Standards: Hands-on experience with OpenTelemetry (OTel), Vector, or similar frameworks for instrumenting applications.
- Grafana Loki: Experience migrating from Splunk to Grafana Loki.
- Other Cloud Platforms: Experience managing native observability tools within AWS.

Additional requirements: This position requires the ability to access federal environments and/or have access to protected federal data. As a condition of employment for this position, the successful candidate must be able to submit documentation establishing U.S. Person status (e.g. a U.S. Citizen, National, Lawful Permanent Resident, Refugee, or Asylee. 22 CFR 120.15) upon hire.
#LI-MM
#LI-Hybrid
P24517_3387022
Below is the annual base salary range for candidates located in the San Francisco Bay Area. Your actual base salary will depend on factors such as your skills, qualifications, experience, and work location. In addition, Okta offers equity (where applicable), bonus, and benefits, including health, dental and vision insurance, 401(k), flexible spending account, and paid leave (including PTO and parental leave) in accordance with our applicable plans and policies.

The annual base salary range for this position for candidates located in the San Francisco Bay Area is between: $194,000 - $267,000 USD
The Okta Experience
- Supporting Your Well-Being
- Driving Social Impact
- Developing Talent and Fostering Connection + Community

We are intentional about connection. Our global community, spanning over 20 offices worldwide, is united by a drive to innovate. Your journey begins with an immersive, in-person onboarding experience designed to accelerate your impact and connect you to our mission and team from day one.

Okta is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, marital status, age, physical or mental disability, or status as a protected veteran. We also consider for employment qualified applicants with arrest and conviction records, consistent with applicable laws.

If reasonable accommodation is needed to complete any part of the job application, interview process, or onboarding please use this Form to request an accommodation.

Notice for New York City Applicants & Employees: Okta may use Automated Employment Decision Tools (AEDT), as defined by New York City Local Law 144, that use artificial intelligence, machine learning, or other automated processes to assist in our recruitment and hiring process. In accordance with NYC Local Law 144, if you are an applicant or employee residing in New York City, please click here to view our full NYC AEDT Notice.


