Staff Observability Platform Engineer (SRE)

$118.45k - $236.9k

Oak St. Health

Lead Platform Reliability Engineer

We're building a world of health around every individual — shaping a more connected, convenient and compassionate health experience. At CVS Health®, you'll be surrounded by passionate colleagues who care deeply, innovate with purpose, hold ourselves accountable and prioritize safety and quality in everything we do. Join us and be part of something bigger – helping to simplify health care one person, one family and one community at a time.

CVS Health PBM is looking for hands-on, passionate people who want to join a high-energy, growing team at the forefront of digital innovation that aims to reinvent what a pharmacy and a health care company can be in the digital world.

As a Lead Platform Reliability Engineer, you will design and implement metrics and observability frameworks with a strong focus on service level objectives (SLOs), service level indicators (SLIs), error budgets, and cloud infrastructure scaling and capacity estimation. This individual contributor role is critical to enhancing our monitoring and observability capabilities, while also driving automation initiatives related to quality gates within the release engineering process. You will work closely with cross-functional teams to ensure the reliability, performance, and scalable growth of our cloud-based systems.

Expectations for the Role:

Metrics Development: Define, implement, and maintain key performance metrics, SLOs, and SLIs to measure system reliability and performance. Ensure alignment with business objectives and operational goals.

Error Budgets: Manage error budgets effectively, collaborating with development teams to balance reliability and feature delivery. Analyze incidents and outages to inform adjustments to error budgets.

Monitoring & Observability: Design and implement comprehensive monitoring solutions to provide real-time visibility into system health. Utilize tools such as Prometheus, Grafana, Loki, Tempo, and other observability platforms to create dashboards and alerts.

Cloud Infrastructure Scaling: Architect, design, and implement scalable cloud infrastructure capable of supporting multiple business applications, ensuring reliability, performance, and future growth.

Quality Gates Automation: Develop and implement automated quality gates that ensure all releases meet defined reliability and performance standards. Lead the release DevOps team to integrate these gates into the CI/CD pipeline.

Incident Management: Assist in incident response efforts by providing insights from metrics and monitoring tools. Conduct post-mortem analyses to identify root causes and recommend preventive measures.

Required Qualifications
  • 10+ years of experience in Software Engineering, Platform Engineering, or SRE.
  • 7+ years of experience with observability practices, including SLIs/SLOs/SLAs, alerting, and incident management.
  • 7+ years building production-grade backend services in Java/Python.
  • 7+ years implementing and operating OpenTelemetry, including OTLP, semantic conventions, and instrumentation patterns.
  • 7+ years with cloud-native and containerized platforms (Docker, Kubernetes, Argo CD).
  • 7+ years working with public cloud platforms (AWS, GCP, or Azure).
  • 5+ years designing and scaling distributed, high-volume data pipelines.
  • 5+ years working with Grafana OSS or comparable observability backends (e.g., Grafana, Loki, Tempo, Prometheus).
  • 5+ years with relational databases (PostgreSQL, MySQL).
Preferred Qualifications
  • Excellent analytical skills and the ability to communicate complex technical concepts to non-technical stakeholders
  • Experience with service meshes and networking technologies such as Envoy and Istio
  • Experience integrating or operating commercial observability platforms (Splunk, AppDynamics, etc.)
  • Experience with streaming and data platforms such as Kafka, Pulsar, or similar technologies
  • Familiarity with time-series, NoSQL, or analytical databases (ClickHouse, Bigtable, Cassandra, etc.)
  • Experience with Infrastructure as Code tools such as Terraform or CloudFormation
  • Experience with cost optimization and capacity planning for large-scale cloud infrastructure
  • Experience with chaos engineering, resiliency testing, or fault injection
  • Background in security-aware platform design, including secure service-to-service communication
  • Experience mentoring senior engineers and influencing platform standards across organizations
  • Strong operational experience supporting 24x7 production systems, including on-call responsibilities
  • Knowledge of security best practices in cloud environments

Bachelor's degree or equivalent experience (HS diploma + 4 years relevant experience)

The typical pay range for this role is: $118,450.00 - $236,900.00. This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above. This position also includes an award target in the company's equity award program.

Our people fuel our future. Our teams reflect the customers, patients, members and communities we serve and we are committed to fostering a workplace where every colleague feels valued and that they belong.

Great benefits for great people. We take pride in offering a comprehensive and competitive mix of pay and benefits that reflects our commitment to our colleagues and their families. This full-time position is eligible for a comprehensive benefits package designed to support the physical, emotional, and financial well-being of colleagues and their families. The benefits for this position include medical, dental, and vision coverage, paid time off, retirement savings options, wellness programs, and other resources, based on eligibility.

Vacancy posted 6 hours ago