
Site Reliability Engineer (SRE) - Azure | DevSecOps | IaC | Governance | Observability

$129k - $143k

Avaya

About Avaya

Avaya is an enterprise software leader that helps the world’s largest organizations and government agencies forge unbreakable connections. The Avaya Infinity™ platform unifies fragmented customer experiences, connecting the channels, insights, technologies, and workflows that together create enduring customer and employee relationships. We believe success is built through strong connections – with each other, with our work, and with our mission. At Avaya, you'll find a community that values your contributions and supports your growth every step of the way. Learn more at

Description

We are seeking a Site Reliability Engineer (SRE) who will drive stability, reliability, and performance across our Azure and GCP-based platforms. This role blends operational excellence, proactive incident management, and strong collaboration with DevOps, Cloud, and Security teams.

The ideal candidate will have hands-on experience with multi-cloud environments (Azure and GCP), IaC (Terraform/Ansible), CI/CD (Jenkins/GitHub Actions), and modern observability and AI-Ops systems. The engineer will also contribute to governance, cost optimization, and automation strategies that reduce toil and prevent issues before they occur. A key aspect of this role is the ability to perform deep-dive troubleshooting of application performance and errors by analyzing logs and traces in platforms like Grafana and Datadog.

This position includes 24×7 support coverage (rotational) and requires strong ownership in managing major incidents, RCA processes, and continuous service improvements.

Key Responsibilities

Reliability & Incident Management
  • Serve as a key member of the 24×7 on-call rotation, responding to and managing incidents across production and pre-production environments.
  • Lead incident bridges, coordinate root cause analysis (RCA), and ensure post-incident reviews drive systemic improvements.
  • Maintain clear communication with cross-functional teams and leadership during major incidents.

Monitoring, AI-Ops, Alerts & Prevention
  • Build, tune, and maintain observability dashboards (Azure Monitor, GCP Operations Suite, Prometheus, Grafana, Datadog, Log Analytics).
  • Perform deep-dive troubleshooting of application and service-level issues using distributed tracing and log analysis (Grafana, Datadog) to pinpoint root causes beyond infrastructure.
  • Define SLOs, SLIs, and error budgets to proactively identify and mitigate reliability risks before customer impact.
  • Integrate AI-Ops tools for anomaly detection, predictive alerting, and automated incident correlation.
  • Continuously enhance alert quality, reduce false positives, and automate runbooks for faster recovery.
  • Analyze trends to prevent recurring issues and support teams in resilience engineering.

Requirements

Required Skills & Experience
  • 5+ years in Site Reliability, DevOps, Cloud Operations, or customer support roles.
  • Demonstrated experience in application-level troubleshooting by analyzing logs and traces to identify bugs, performance bottlenecks, and error conditions.
  • Expertise in Azure and GCP cloud operations and distributed system reliability.
  • Understanding of Terraform, Ansible, and CI/CD pipelines (Jenkins, GitHub Actions).
  • Experience with observability and AI-Ops tools (Azure Monitor, GCP Operations Suite, Grafana, Prometheus, Datadog, etc.).
  • Solid grasp of incident management frameworks (P1–P3 handling, RCA, PIRs, on-call rotations).
  • Excellent analytical, troubleshooting, and communication skills.

Desired Behaviours
  • Proactive Prevention: Identifies and resolves risks before they escalate into incidents.
  • AI-Driven Mindset: Applies AI and automation to improve reliability and reduce human intervention.
  • Accountability: Owns service reliability and communicates with clarity.
  • Collaboration: Works seamlessly with platform, DevOps, and product teams.
  • Efficiency: Focuses on automation to reduce manual effort and improve MTTR.
  • Continuous Improvement: Learns from failures, iterates processes, and enhances documentation.

The pay range for this opportunity is from $129,000 to $143,000 + performance-related bonus + benefits. This range represents the anticipated low and high end of the salary for this position. This role is also eligible to receive an annual bonus that aligns with individual and company performance. Actual salaries will vary and are based on factors such as a candidate’s qualifications, skills, and competencies.

Applicants must be currently authorized to work in the United States without the need for visa sponsorship now or in the future.

Avaya is an Equal Opportunity employer and a U.S. Federal Contractor. Our commitment to equality is a core value of Avaya. All qualified applicants and employees receive equal treatment without consideration for race, religion, sex, age, sexual orientation, gender identity, national origin, disability, status as a protected veteran or any other protected characteristic. In general, positions at Avaya require the ability to communicate and use office technology effectively. Physical requirements may vary by assigned work location.

This job brief/description is subject to change. Nothing in this job description restricts Avaya's right to alter the duties and responsibilities of this position at any time for any reason.

Vacancy posted 5 days ago
Similar jobs that could be interesting for you
Based on the Site Reliability Engineer (SRE) - Azure | DevSecOps | IaC | Governance | Observability vacancy in the United States
  •  ...Infrastructure And Site Reliability Engineering Leader At...  ...infrastructure, SRE, and AI-driven operations...  ...environments (Azure), supporting...  ...pipelines, and observability tools (e.g.,...  ...Infrastructure-as-Code (IaC), and managing...  ...architectures, DevSecOps, and compliance... 
    Suggested

    Resideo Technologies

    Greenwald, MN
    2 days ago
  • Senior SRE Engineer Azure Healthcare Observability Healthcare SaaS As we expand our customer deployments, we seek...  ...SRE / DevOps Engineer to ensure reliability, observability, and operational excellence...  ...), CTO Technologies Must-have: Site Reliability Engineering (... 
    Suggested
    Immediate start
    Flexible hours

    AppRecode, Inc.

    Middletown, NJ
    4 days ago
  •  ...alert: DevOps / DevSecOps Engineer - Azure | IaC | Security | Automation...  ...organizations and government agencies forge...  ...operations teams to drive reliability, scalability, and...  ...strategies with observability and alerting. DevSecOps...  ...in DevOps, Site Reliability, or... 
    Suggested
    For contractors
    Work at office
    Visa sponsorship

    Avaya

    Richmond, VA
    1 day ago
  •  ...This is a Software Engineering position at...  ...security and strong governance and promotes...  ...experienced and driven Site Reliability Engineer (SRE) to join our AI...  ..., Cloud (AWS, Azure, and/or Google),...  ...-as-code (IaC) for provisioning...  ...with monitoring / observability / logging / alerting... 
    Suggested
    Full time

    Morgan Stanley

    Alpharetta, GA
    20 hours ago
  •  ...and operate highly reliable cloud systems supporting...  ...workloads for U.S. Government customers. This role is centered on DevSecOps and site reliability engineering, with a strong...  ...ensuring systems are observable, fault-tolerant,...  ...professional experience as an SRE, DevOps,... 
    Suggested
    Permanent employment
    Remote work

    Quindar

    United States
    20 hours ago
  •  ...Posted on 11/13/2025 Title: Senior Site Reliability Engineer (SRE) Location: Remote About January At January...  ...infrastructure, design modern observability solutions, and build sustainable on-...  ...and maintain Infrastructure as Code (IaC) using Terraform or CloudFormation for... 
    Remote work

    Govserviceshub

    New York, NY
    4 days ago
  • $164k - $200k

     ...Description Senior DevSecOps Engineer About us...  ...transform the Governance, Risk, and...  ...ensuring the reliability, scalability,...  ...understanding of Azure technologies....  ...Terraform/Terragrunt IaC pipeline for...  ...environments, observability systems, and...  ...experience in SRE, DevSecOps or... 
    For contractors
    Local area
    Immediate start
    Home office

    Hyperproof

    Bellevue, WA
    11 days ago
  •  ...We are seeking a Site Reliability Engineer (SRE) with deep expertise in AWS cloud infrastructure , Infrastructure as Code (IaC) , and large-scale production operations. This role is...  ...analysis (RCA), and postmortems Improve observability using logging, monitoring, and... 
    Remote work
    Shift work

    GR8 People

    New York, NY
    2 days ago
  •  ...Site Reliability Engineering (SRE) Platform Engineer (Lead) Job Number:...  ...software enhancements, observability, automation, and...  ...Infrastructure: Microsoft Azure (Software, Storage,...  ...troubleshooting. DevSecOps & Automation: Lead...  ...and change governance. Drive automation to... 
    Local area

    Eclaro

    Rochester, NY
    2 days ago
  • A leading consulting firm is seeking a Site Reliability Engineer (SRE) for a remote role based in Germany. The successful candidate will design and maintain observability platforms, automate deployments, and contribute to scalable monitoring solutions. Required qualifications... 
    Remote job
    Fixed term contract

    Starcom consulting limited

    New Bremen, OH
    2 days ago
  •  ...Technical Skills Azure DevOps (repos, pipelines) CI/CD pipelines...  ...JavaScript, Python) Grafana or observability tools SonarQube (code quality...  ...environments Knowledge of SRE principles and incident management...  ...resolution time, platform reliability, developer productivity, and... 

    Apex Systems

    New York, NY
    2 days ago
  • $140k - $160k

     ...seeking a Senior DevOps Engineer / Site Reliability Engineer (SRE) to architect and...  ...• Build and operate observability platforms and CI/CD pipelines...  ...cloud platform (AWS/Azure/GCP) with deep understanding...  ..., and service governance • Skilled in IaC (Infrastructure as Code... 
    Immediate start
    Remote work

    Thomas Talent Network

    Raleigh, NC
    14 days ago
  • $73.45k - $132.78k

     ...opening for a Site Reliability Automation and...  ...Orchestration Engineer on a high-...  ...transfer. SRE Engineering and...  ...supporting the federal government. Additional...  ...of Agile and DevSecOps/SRE concepts...  ...Jira and/or Azure DevOps workflows...  ...as Code (IaC) tools such as... 
    Local area
    Immediate start
    Remote work

    Leidos

    Fort Shafter, HI
    1 day ago
  • $141.6k - $212.4k

     ...Looking for a SRE Platform...  ...combining platform engineering + SRE +...  ...ops, schema governance, RBAC/ACLs,...  ...expertise in observability using Datadog...  ...building DevSecOps automation: CI/CD, IaC, security controls...  ...and improve reliability. DevSecOps...  ...Our Careers Site is only for... 
    Work experience placement
    Work from home
    Weekend work
    Weekday work

    Nutanix

    San Diego, CA
    1 day ago
  • $147k - $237.5k

     ...for a Principal SRE to join our InfoSec...  ...your mad SRE/DevSecOps skills, you’ll build...  ...Infrastructure as Code (IaC) using Terraform...  ...and end‑to‑end observability across production...  ...Computer Science/Engineering or equivalent...  ...handling US Federal Government project... 
    Full time
    Work at office
    Visa sponsorship
    Work visa

    Palo Alto Networks, Inc.

    Santa Clara, CA
    20 hours ago
  •  ...architecture, automation, governance, and compliance to protect...  ...including Infrastructure as Code (IaC), DevSecOps integration, identity and...  ...Support unified observability capabilities Configure...  ...~ Experience with AWS, Azure, or GCP 5+ years' experience... 
    Remote work

    RICEFW Technologies

    United States
    2 days ago
  • SRE / Dynatrace Observability & Automation Engineer Mount Laurel, NJ | 6 - 8 years of experience Job Description Strong hands‑on experience with Dynatrace...  ...of known failure patterns. Partner with platform, governance, and control teams to ensure observability,... 
    Permanent employment

    Tata Consultancy Services Limited

    Mount Laurel, NJ
    4 days ago
  • $85 - $90 per hour

    Senior SRE Engineer (AKS, Azure, Terraform, Kubernetes, and PowerShell.) JOB ID...  ...Month contract *** 4 days on-site *** -- We have a great...  ...Kubernetes. Experience using IAC tools such as Terraform,...  ...scripting Experience managing observability tools such as Grafana,... 
    Hourly pay
    Contract work
    Work experience placement

    CorGTA

    Dallas, TX
    20 hours ago
  •  ...Senior Platform Engineer, Azure, Terraform, IaC, Security, DR –...  ...automation and governance that empowers...  ...compliance and reliability locked in. Join...  ...infrastructure modules, observability, application...  ...in Systems/SRE/DevOps roles....  ...security gates (DevSecOps). ~ Expert in... 
    Permanent employment
    Work at office
    2 days per week
    3 days per week

    Riccione Resources

    Fort Worth, TX
    19 days ago
  •  ...seeking a results-driven DevOps / DevSecOps Engineer to automate and secure cloud infrastructure on Azure. The role demands deep...  ...expertise in Infrastructure as Code (IaC), CI/CD pipelines, and...  ...and security teams to enhance reliability and compliance. Responsibilities... 
    Remote work

    Avaya

    New York, NY
    4 days ago
  • $118.45k - $236.9k

     ...Lead Platform Reliability Engineer We're building a world of health around...  ...design and implement metrics and observability frameworks with a strong...  ..., Platform Engineering, or SRE. ~7+ years of experience with...  ...platforms (AWS, GCP, or Azure). ~5+ years designing and... 
    Hourly pay
    Full time
    Temporary work

    Oak St. Health

    Scottsdale, AZ
    1 day ago
  •  ...Senior Platform Engineer (DevSecOps) to architect...  ..., and is on site. What You’ll...  ...& IaC: Architect, implement...  ...infrastructure (AWS, Azure, GCP, OCI)...  .... Cloud Governance: Establish and...  .... Site Reliability Engineering (SRE): Implement robust observability (monitoring,... 
    Flexible hours
    Shift work

    Rippling

    San Antonio, TX
    1 day ago
  •  ...Senior Site Reliability Engineer - Operations As a leading financial services...  ...Site Reliability Engineer (SRE) to join our Operations team...  ...monitoring, alerting, and observability tools (e.g., Prometheus, Grafana...  ...infrastructure (AWS, Azure, GCP, VMware, etc.). Manage... 
    Ongoing contract
    Casual work
    Remote work
    Flexible hours

    SS&C Technologies Holdings

    Pensacola, FL
    4 days ago
  • $160k - $230k

     ...infrastructure Implementing observability and monitoring to ensure delivery...  ...COTS tools Write Terraform IaC code to install custom...  ...JBoss, Tomcat) Knowledge of DevSecOps, Agile-Scrum, JIRA methodologies...  ...Solid foundations of DevOps and SRE principles and roles Technical... 
    Permanent employment
    Full time
    Day shift

    Federal Reserve Bank of New York

    New York, NY
    8 hours ago
  •  ...secure, and user-friendly applications. Site Reliability Engineer (SRE) Job Title: Site Reliability Engineer...  ...that promote safe, frequent, and observable releases. Lead capacity planning and performance...  ...with a major cloud platform (AWS, Azure, or GCP). Background in capacity... 
    Full time
    H1b
    Local area
    Immediate start
    Remote work

    Bright Vision Technologies

    Frisco, TX
    3 days ago
  • $65 - $70 per hour

     ...leading organization in the technology sector, is seeking a Site Reliability Engineer – Observability & Performance Engineering to join their team. As a Site...  ...of experience in IT, with at least 6 years in DevOps, SRE, or performance engineering roles. ~ Proficiency in... 
    Weekly pay
    3 days per week

    Manpower Group Inc.

    Chandler, AZ
    4 days ago
  •  ...Site Reliability Engineer (SRE) | Lockheed Martin The 1LMX MES COE is seeking an engineer who will own infrastructure...  ...Cloud Service Providers (e.g., AWS, Azure) - Experience with automation tools / scripting (e.g., GitLab, Ansible, IaC) - Experience working with both Windows... 
    Remote work

    Lockheed Martin Corporation

    United States
    1 day ago
  •  ...We're seeking a highly skilled Site Reliability Engineer (SRE) to join our engineering team and help ensure...  ...and improve system performance. Build observability into systems through metrics,...  ...experience. Experience with cloud platforms (e.g., Azure) and container orchestration (e.g.,... 
    Temporary work
    Interim role
    Remote work
    Flexible hours

    OutSolve - Beyond Compliance

    Mission, KS
    4 days ago
  • $1,000 per month

     ...role We're looking for a talented Site Reliability Engineer to join our infrastructure team and help...  ...our global banking platform. As an SRE at Bloxley, you'll be responsible for...  ...: Ensure high availability, observability, and security across all financial systems... 
    Full time
    Immediate start
    Remote work
    Worldwide
    Flexible hours

    Bloxley

    Mission, KS
    4 days ago
  •  ...DroneUp, LLC is hiring an SRE - Platform Engineer in the United States, focusing on the reliability and performance of their IT infrastructure while mentoring teams. Responsibilities include managing SLOs and incident response while working with cloud technologies such... 

    DroneUp

    New York, NY
    2 days ago
