Staff Site Reliability Engineer - Observability GCP
$194k - $267k
Okta
Secure Every Identity, from AI to Human
Identity is the key to unlocking the potential of AI. Okta secures AI by building the trusted, neutral infrastructure that enables organizations to safely embrace this new era. This work requires a relentless drive to solve complex challenges with real-world stakes. We are looking for builders and owners who operate with speed and urgency and execute with excellence. This is an opportunity to do career-defining work. We're all in on this mission. If you are too, let's talk.
We are seeking a highly technical Observability Site Reliability Engineer with a specialty in Google Cloud to own and expand our Observability ecosystem into GCP. In this role, you will move beyond simple monitoring to delivering a world-class, comprehensive, scalable Observability Platform that enables our SRE teams and business partners. You will treat infrastructure as code, using Terraform and strong coding proficiency in Go, Python, or Ruby to automate the deployment of agents and collectors across complex distributed systems.
Key Responsibilities
- Automated Infrastructure: Design, build, and maintain scalable observability infrastructure using tools like Terraform.
- GCP Observability Engineering: Optimize the collection, processing, and storage of observability data to ensure high reliability and low latency of our Splunk and Grafana services.
- Incident Response: Participate in on-call rotations and lead post-incident reviews to drive systemic improvements and "observability-driven development."
- Automation: Eliminate "toil" by automating the deployment and scaling of observability agents and collectors.
Required Skills & Experience (The Essentials)
- GKE: A minimum of 5 years of experience scaling and managing observability on Google Cloud Platform.
- Visualization: Expertise in creating intuitive, actionable Splunk or Grafana dashboards that correlate data across multiple sources.
- SRE Mindset: A minimum of 3 years of experience in an SRE, DevOps, or Systems Engineering role with a focus on high-availability systems.
- Programming Proficiency: Strong coding skills in Python or Go for building internal tools and automating workflows.
- Distributed Systems: Deep understanding of Linux internals, networking (TCP/IP, DNS, Load Balancing), and container orchestration (Kubernetes/GKE).
- Problem Solving: A data-driven approach to debugging complex, cross-service performance bottlenecks.
Bonus Skills (The "Nice-to-Haves")
- Telemetry Standards: Hands-on experience with OpenTelemetry (OTel), Vector, or similar frameworks for instrumenting applications.
- Grafana Loki: Experience migrating from Splunk to Grafana Loki.
- Other Cloud Platforms: Experience managing native observability tools within AWS.
Additional requirements: This position requires the ability to access federal environments and/or have access to protected federal data. As a condition of employment for this position, the successful candidate must be able to submit documentation establishing U.S. Person status (e.g., a U.S. Citizen, National, Lawful Permanent Resident, Refugee, or Asylee; 22 CFR 120.15) upon hire.
Below is the annual base salary range for candidates located in the San Francisco Bay Area. Your actual base salary will depend on factors such as your skills, qualifications, experience, and work location. In addition, Okta offers equity (where applicable), bonus, and benefits, including health, dental and vision insurance, 401(k), flexible spending account, and paid leave (including PTO and parental leave) in accordance with our applicable plans and policies. The annual base salary range for this position for candidates located in the San Francisco Bay Area is between $194,000 and $267,000 USD.
The Okta Experience
- Supporting Your Well-Being
- Driving Social Impact
- Developing Talent and Fostering Connection + Community
We are intentional about connection. Our global community, spanning over 20 offices worldwide, is united by a drive to innovate. Your journey begins with an immersive, in-person onboarding experience designed to accelerate your impact and connect you to our mission and team from day one.
Okta is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, marital status, age, physical or mental disability, or status as a protected veteran. We also consider for employment qualified applicants with arrest and conviction records, consistent with applicable laws. If reasonable accommodation is needed to complete any part of the job application, interview process, or onboarding, please use this Form to request an accommodation.
Notice for New York City Applicants & Employees: Okta may use Automated Employment Decision Tools (AEDT), as defined by New York City Local Law 144, that use artificial intelligence, machine learning, or other automated processes to assist in our recruitment and hiring process. In accordance with NYC Local Law 144, if you are an applicant or employee residing in New York City, please click here to view our full NYC AEDT Notice.
$160k - $200k
Site Reliability Engineer, Observability
Please note this is for Chicago, Illinois, United States. You only need to apply to one location if there are multiple listed for the job. At Ripple, we’re building a world where value moves like information does today...
Full time, Work at office, Local area
$160k - $200k
Ripple in Chicago is seeking a Senior Site Reliability Engineer to enhance product reliability and performance. In this role, you will engage with engineering teams to implement observability practices and optimize CI/CD pipelines, ensuring robust security. The position...
$93.9k - $156.5k
Site Reliability Engineer II
Locations: Chicago - 20 S. Wacker... ...senior engineers to learn how we observe, monitor, automate, and improve... ...applications to Google Cloud Platform (GCP). Collaborate with cross-functional teams...
Work at office, Local area, Worldwide, 2 days per week
$93.9k - $156.5k
...requiring 2 days per week on‑site at our corporate office... ...and rock‑solid reliability to seamlessly handle the... ...product teams and senior engineers to assist with building out observability, monitoring and alerting... ...Google Cloud Platform (GCP). Collaborate with cross...
Work at office, Local area, 2 days per week
$125.04k - $187.56k
..., Technology and more. Overview The Site Reliability Engineer (SRE) III is responsible for ensuring... ...production systems through automation, observability, incident response, and... ...cloud infrastructure (AWS, Azure, or GCP). Strong analytical, debugging, and...
Full time, Work at office, Remote work, Flexible hours
- ...seeking a highly skilled and experienced Site Reliability Engineer (SRE) to join our dynamic team. In... ...SLO/SLI metrics and runbooks. Improve observability through scalable monitoring solutions... ...such as Google Cloud Platform (GCP) and Pivotal Cloud Foundry (PCF). 4+ years...
- DevOps / Site Reliability Engineer ID70127 Full time | AgileEngine | United States Posted On 06/17/2026 Job Information City... ...infrastructure, automate operational controls, and improve observability across AWS, Azure, and GCP environments. WHAT YOU WILL DO Scale and...
  Full time, Work at office, Remote work, Visa sponsorship, Work visa, Flexible hours
- Hitachi Vantara Corporation is looking for a Site Reliability Engineer (SRE) to design and operate the enterprise observability stack, including Azure Monitor and Managed Grafana. This position requires extensive experience in SRE and cloud infrastructure, with a focus...
- Senior Site Reliability Engineer - Google Distributed Cloud Edge (Edge SRE) Location: Hybrid - Chicago... .... Develop monitoring, alerting, and observability frameworks (real-time + historical)... ...Code (Terraform). Proven background in GCP (preferred) and/or AWS cloud...
  Contract work
- Title: GCP Cloud Engineer / Developer Location: Chicago, IL (Hybrid) Employment Type: 3-6 Month Contract... ...Improve system performance, reliability, and cost efficiency Implement monitoring, alerting, and observability across environments Ongoing Success Ensure...
  Contract work
- ...digital landscape. The Role We are seeking an experienced GCP Cloud Platform Engineer to support our client on a critical cloud infrastructure initiative... ...through final delivery, ensuring scalable, secure, and reliable cloud environments. Key Responsibilities Cloud...
  Contract work, Shift work
$125.83k - $221.28k
...investment activities. What you’ll do We are building a new Site Reliability Engineering function and need a leader to establish SRE practices... ...tuning and optimization of monitoring, alerting, and observability tooling. Drive reduction of system disruptions through automation...
Flexible hours
$140k - $180k
A global trading firm in Chicago is seeking a Platform Engineer to join their Platform Infrastructure team. The role focuses on deploying, observing, and scaling systems critical to trading operations. Responsibilities include automating deployment patterns, driving CI/...
$130k - $225k
Site Reliability Engineer - Algorithmic Trading Job Location Chicago Employment type Regular Department Technology Targeted Start Date Immediate... ..., trading and infrastructure - someone who creates observability that surfaces issues before they ever cause a problem,...
Temporary work, Work at office, Immediate start, Flexible hours
- Site Reliability Engineer (Chicago, IL; Dallas, TX; ...) Qualifications: 8+ years of Software Engineering... ...on Google Cloud Platform (GCP) for Snowflake data warehousing. Monitor... ...effectively with the client, IT management and staff, and other groups in Information Technology...
  Contract work, For contractors, Work experience placement
$130k - $165k
Job Title: Senior Software Engineer Company: Snapsheet Job Location: USA, Remote Job... ...Job Department: Technology Team: Site Reliability Engineering About Snapsheet Snapsheet... ...Build and operate our core internal observability platform Monitor our systems for capacity...
Full time, Temporary work, Local area, Remote work, Visa sponsorship, Work visa, Flexible hours
$130k - $180k
...belonging, collaboration, and accomplishment. Being a Senior Site Reliability Engineer at iManage Means… You are an engineer, a builder, and a... ...in on‑call rotations. You’ll be a key voice in observability, change management, and service scalability, providing guidance...
Work at office, Local area, Remote work, Worldwide, Monday to Friday, Flexible hours
$130k - $140k
...IL Design and operate the enterprise observability stack: Azure Monitor, Log Analytics, and... ...+ years of experience in SRE, platform engineering, or cloud infrastructure engineering in... ..., you’re placing your trust in a safe, reliable, and ethical global company. Integrity...
Work experience placement, Work at office
- ...external vendors if their integrations fail Measure the front-end metrics for the site with various tools available Qualifications Must have worked on support projects Must know GCP, Kubernetes and Dynatrace Knowledge of Splunk / Sumologic log monitoring is an advantage...
- Description Design and operate the enterprise observability stack: Azure Monitor, Log Analytics, and Managed Grafana. Develop self-healing... ...Python. Requirements 7+ years of experience in SRE, platform engineering, or cloud infrastructure engineering in large-scale...
$72k - $90k
A leading observability company is hiring a Solutions Engineer to provide technical support across North America. This remote role involves collaborating with Sales, Customer Success, and Engineering teams to deliver product demonstrations and support technical evaluations...
Remote job
- Job Title: GCP Data Engineer Duration: 6 months Contract to hire Location: Chicago is the preferred... ...data architectures, and enabling reliable, high‑quality data platforms for analytics... ...for data workflows. Data Quality, Observability & Reliability Implement and maintain data...
  Contract work
$140k - $170k
ITRS is looking for an experienced FDE Engineer to support our collaborative team in Chicago. This... ...in the office, with flexibility for client site visits. You will be involved in developing infrastructure for our observability platform, working closely with clients to ensure...
Work at office, 2 days per week
- A leading technology consulting firm seeks an experienced GCP Cloud Platform Engineer in Chicago to support a critical cloud infrastructure initiative. In this role, you will build and manage GCP infrastructure, execute cloud migrations, and implement Infrastructure as...
$21 - $45 per hour
...IL 60622 Responsibilities The Associate Site Reliability Engineer (SRE) ensures the reliability,... ...Strong knowledge of cloud platforms (AWS, GCP, Azure) and infrastructure-as-code tools... ...company is dedicated to empowering its staff with a comprehensive, competitive benefits...
Full time, Local area, Shift work, Day shift
$128.5k - $214.1k
We're looking for a Staff Site Reliability Engineer to join our team, focusing on the core systems that power global financial markets. This isn... ...solutions at a global scale. Spearhead the adoption of observability and performance testing, guiding teams to a "build with...
Work at office, Worldwide, 2 days per week
$58 per hour
...Job Title Site Reliability/DevOps Engineer End Client Northern Trust Bill Rate $58/hr Location Chicago... ...do not submit Hybrid Cloud/AWS /GCP/ Mixed profiles. Skills: Overall... ...Apps - Must 8+ years Monitoring & Observability tools of which 3+ years with Grafana...
Work at office, 3 days per week
- Itrs Insights is looking for an experienced FDE Engineer to join our Chicago team. In this permanent, full-time position, you will work on developing an observability platform to support critical applications. Your role involves collaboration with Product, Engineering,...
  Permanent employment, Full time
- CME Chicago Mercantile Exchange Inc. is seeking a Site Reliability Engineer III to enhance stability for CME Clearing & Risk. In this role, you will... ...management services. Your expertise in cloud platforms like GCP, AWS, or Azure, alongside SRE principles, will be crucial in...
$132.1k - $220.1k
The Data Platform Engineering team is a collection of highly skilled individuals dedicated to building and scaling the infrastructure that... ...Core Skills and Qualifications Cloud Architecture & Orchestration GCP Mastery: Expert‑level experience architecting and operating...

