Staff Site Reliability Engineer - Observability

$147k - $202k

Okta

Secure Every Identity, from AI to Human

Identity is the key to unlocking the potential of AI. Okta secures AI by building the trusted, neutral infrastructure that enables organizations to safely embrace this new era. This work requires a relentless drive to solve complex challenges with real-world stakes. We are looking for builders and owners who operate with speed and urgency and execute with excellence.

This is an opportunity to do career-defining work. We're all in on this mission. If you are too, let's talk.

Position Overview:

We are seeking a highly technical Staff Observability Site Reliability Engineer with a specialty in Splunk to own and evolve our Splunk ecosystem. In this role, you will move beyond simple monitoring to deliver a world-class, comprehensive, scalable Observability Platform that enables our SRE teams and business partners. You will treat infrastructure as code, using Terraform and strong coding proficiency in Go, Python, or Ruby, to automate the deployment of agents and collectors across complex distributed systems.

Key Responsibilities

  • Automated Infrastructure: Design, build, and maintain scalable observability infrastructure using tools like Terraform.
  • Splunk Engineering: Optimize the collection, processing, and storage of log data to ensure high reliability and low latency of our Splunk services.
  • Incident Response: Participate in on-call rotations and lead post-incident reviews to drive systemic improvements and "observability-driven development."
  • Automation: Eliminate "toil" by automating the deployment and scaling of observability agents and collectors.

Required Skills & Experience (The Essentials)

  • Log Management: 5+ years of experience scaling and managing Splunk Cloud (1000+ SVCs), including Workload Management (WLM) and HTTP Event Collector (HEC) optimization.
  • Visualization: Expertise in creating intuitive, actionable Splunk dashboards that correlate data across multiple sources.
  • SRE Mindset: 5+ years of experience in an SRE, DevOps, or Systems Engineering role with a focus on high-availability systems.
  • Programming Proficiency: Strong coding skills in SPL and Go for building internal tools and automating workflows.
  • Distributed Systems: Deep understanding of Linux internals, networking (TCP/IP, DNS, Load Balancing), and container orchestration (Kubernetes/EKS).
  • Problem Solving: A data-driven approach to debugging complex, cross-service performance bottlenecks.

Bonus Skills (The "Nice-to-Haves")

  • Telemetry Standards: Hands-on experience with OpenTelemetry (OTel), Vector, or similar frameworks for instrumenting applications.
  • Chargeback App: Experience implementing a Splunk chargeback app for usage reporting.
  • Cloud Platforms: Experience managing native observability tools within AWS or GCP.

Additional requirements:

  • This position requires the ability to access federal environments and/or protected federal data. As a condition of employment for this position, the successful candidate must be able to submit documentation establishing U.S. Person status (e.g., a U.S. Citizen, National, Lawful Permanent Resident, Refugee, or Asylee; see 22 CFR 120.15) upon hire.
  • The successful candidate must attend in-person onboarding in our San Francisco office during the first week of employment.

Below is the annual base salary range for candidates located in California (excluding San Francisco Bay Area), Colorado, Illinois, New York, and Washington. Your actual base salary will depend on factors such as your skills, qualifications, experience, and work location. In addition, Okta offers equity (where applicable), bonus, and benefits, including health, dental, and vision insurance, 401(k), flexible spending account, and paid leave (including PTO and parental leave) in accordance with our applicable plans and policies. To learn more, see Okta's Total Rewards program.

The annual base salary range for this position for candidates located in California (excluding San Francisco Bay Area), Colorado, Illinois, New York, and Washington is $147,000 to $202,000 USD.


The Okta Experience

We are intentional about connection. Our global community, spanning over 20 offices worldwide, is united by a drive to innovate. Your journey begins with an immersive, in-person onboarding experience designed to accelerate your impact and connect you to our mission and team from day one.

Okta is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, marital status, age, physical or mental disability, or status as a protected veteran. We also consider for employment qualified applicants with arrest and convictions records, consistent with applicable laws.

If a reasonable accommodation is needed to complete any part of the job application, interview process, or onboarding, please use Okta's accommodation request form.

Notice for New York City Applicants & Employees: Okta may use Automated Employment Decision Tools (AEDT), as defined by New York City Local Law 144, that use artificial intelligence, machine learning, or other automated processes to assist in our recruitment and hiring process. In accordance with NYC Local Law 144, if you are an applicant or employee residing in New York City, please see our full NYC AEDT Notice.

Okta is committed to complying with applicable data privacy and security laws and regulations. For more information, please see our Personnel and Job Candidate Privacy Notice.
Vacancy posted 28 days ago