Site Reliability Engineer (SRE) - Azure | DevSecOps | IaC | Governance | Observability

$129k - $143k

Avaya

About Avaya

Avaya is an enterprise software leader that helps the world’s largest organizations and government agencies forge unbreakable connections. The Avaya Infinity™ platform unifies fragmented customer experiences, connecting the channels, insights, technologies, and workflows that together create enduring customer and employee relationships. We believe success is built through strong connections – with each other, with our work, and with our mission. At Avaya, you'll find a community that values your contributions and supports your growth every step of the way. Learn more at

Description

We are seeking a Site Reliability Engineer (SRE) who will drive stability, reliability, and performance across our Azure and GCP-based platforms. This role blends operational excellence, proactive incident management, and strong collaboration with DevOps, Cloud, and Security teams. The ideal candidate will have hands-on experience with multi-cloud environments (Azure and GCP), IaC (Terraform/Ansible), CI/CD (Jenkins/GitHub Actions), and modern observability and AI-Ops systems. The engineer will also contribute to governance, cost optimization, and automation strategies that reduce toil and prevent issues before they occur. A key aspect of this role is the ability to perform deep-dive troubleshooting of application performance and errors by analyzing logs and traces in platforms like Grafana and Datadog. This position includes 24×7 support coverage (rotational) and requires strong ownership in managing major incidents, RCA processes, and continuous service improvements.

Key Responsibilities

Reliability & Incident Management
Serve as a key member of the 24×7 on-call rotation, responding to and managing incidents across production and pre-production environments.
Lead incident bridges, coordinate root cause analysis (RCA), and ensure post-incident reviews drive systemic improvements.
Maintain clear communication with cross-functional teams and leadership during major incidents.

Monitoring, AI-Ops, Alerts & Prevention
Build, tune, and maintain observability dashboards (Azure Monitor, GCP Operations Suite, Prometheus, Grafana, Datadog, Log Analytics).
Perform deep-dive troubleshooting of application and service-level issues using distributed tracing and log analysis (Grafana, Datadog) to pinpoint root causes beyond infrastructure.
Define SLOs, SLIs, and error budgets to proactively identify and mitigate reliability risks before customer impact.
Integrate AI-Ops tools for anomaly detection, predictive alerting, and automated incident correlation.
Continuously enhance alert quality, reduce false positives, and automate runbooks for faster recovery.
Analyze trends to prevent recurring issues and support teams in resilience engineering.

Requirements

Required Skills & Experience
5+ years in Site Reliability, DevOps, Cloud Operations, or customer support roles.
Demonstrated experience in application-level troubleshooting by analyzing logs and traces to identify bugs, performance bottlenecks, and error conditions.
Expertise in Azure and GCP cloud operations and distributed system reliability.
Understanding of Terraform, Ansible, and CI/CD pipelines (Jenkins, GitHub Actions).
Experience with observability and AI-Ops tools (Azure Monitor, GCP Operations Suite, Grafana, Prometheus, Datadog, etc.).
Solid grasp of incident management frameworks (P1–P3 handling, RCA, PIRs, on-call rotations).
Excellent analytical, troubleshooting, and communication skills.

Desired Behaviours
Proactive Prevention: Identifies and resolves risks before they escalate into incidents.
AI-Driven Mindset: Applies AI and automation to improve reliability and reduce human intervention.
Accountability: Owns service reliability and communicates with clarity.
Collaboration: Works seamlessly with platform, DevOps, and product teams.
Efficiency: Focuses on automation to reduce manual effort and improve MTTR.
Continuous Improvement: Learns from failures, iterates processes, and enhances documentation.

The pay range for this opportunity is from $129,000 to $143,000 + performance-related bonus + benefits. This range represents the anticipated low and high end of the salary for this position. This role is also eligible to receive an annual bonus that aligns with individual and company performance. Actual salaries will vary and are based on factors such as a candidate’s qualifications, skills, and competencies.

Applicants must be currently authorized to work in the United States without the need for visa sponsorship now or in the future.

Avaya is an Equal Opportunity employer and a U.S. Federal Contractor. Our commitment to equality is a core value of Avaya. All qualified applicants and employees receive equal treatment without consideration for race, religion, sex, age, sexual orientation, gender identity, national origin, disability, status as a protected veteran or any other protected characteristic. In general, positions at Avaya require the ability to communicate and use office technology effectively. Physical requirements may vary by assigned work location. This job brief/description is subject to change. Nothing in this job description restricts Avaya's right to alter the duties and responsibilities of this position at any time for any reason.

Vacancy posted 5 days ago

Similar jobs that could be interesting for you
Based on the Site Reliability Engineer (SRE) - Azure | DevSecOps | IaC | Governance | Observability in United States vacancy

Director, Site Reliability Engineering & Cloud Operations (SRE)
...Infrastructure And Site Reliability Engineering Leader At... ...infrastructure, SRE, and AI-driven operations... ...environments (Azure), supporting... ...pipelines, and observability tools (e.g.,... ...Infrastructure-as-Code (IaC), and managing... ...architectures, DevSecOps, and compliance...
Suggested
Resideo Technologies
Greenwald, MN
2 days ago
Senior SRE Engineer Azure Healthcare Observability Healthcare SaaS
Senior SRE Engineer Azure Healthcare Observability Healthcare SaaS As we expand our customer deployments, we seek... ...SRE / DevOps Engineer to ensure reliability, observability, and operational excellence... ...), CTO Technologies Must-have: Site Reliability Engineering (...
Suggested
Immediate start
Flexible hours
AppRecode, Inc.
Middletown, NJ
4 days ago
DevOps / DevSecOps Engineer - Azure | GCP | IaC | Security | Automation | AI-Ops
...alert: DevOps / DevSecOps Engineer - Azure | IaC | Security | Automation... ...organizations and government agencies forge... ...operations teams to drive reliability, scalability, and... ...strategies with observability and alerting. DevSecOps... ...in DevOps, Site Reliability, or...
Suggested
For contractors
Work at office
Visa sponsorship
Avaya
Richmond, VA
1 day ago
Site Reliability Engineer (SRE) - AI Platform & Cloud
...This is a Software Engineering position at... ...security and strong governance and promotes... ...experienced and driven Site Reliability Engineer (SRE) to join our AI... ..., Cloud (AWS, Azure, and/or Google),... ...-as-code (IaC) for provisioning... ...with monitoring / observability / logging / alerting...
Suggested
Full time
Morgan Stanley
Alpharetta, GA
20 hours ago
Site Reliability Engineer (SRE)
...and operate highly reliable cloud systems supporting... ...workloads for U.S. Government customers. This role is centered on DevSecOps and site reliability engineering, with a strong... ...ensuring systems are observable, fault-tolerant,... ...professional experience as an SRE, DevOps,...
Suggested
Permanent employment
Remote work
Quindar
United States
20 hours ago
Senior Site Reliability Engineer (SRE)
...Posted on 11/13/2025 Title: Senior Site Reliability Engineer (SRE) Location: Remote About January At January... ...infrastructure, design modern observability solutions, and build sustainable on-... ...and maintain Infrastructure as Code (IaC) using Terraform or CloudFormation for...
Remote work
Govserviceshub
New York, NY
4 days ago
Senior DevSecOps Engineer
$164k - $200k
...Description Senior DevSecOps Engineer About us... ...transform the Governance, Risk, and... ...ensuring the reliability, scalability,... ...understanding of Azure technologies.... ...Terraform/Terragrunt IaC pipeline for... ...environments, observability systems, and... ...experience in SRE, DevSecOps or...
For contractors
Local area
Immediate start
Home office
Hyperproof
Bellevue, WA
11 days ago
Site Reliability Engineer (SRE) - AWS Cloud (Terraform & Ansible Focus)
...We are seeking a Site Reliability Engineer (SRE) with deep expertise in AWS cloud infrastructure, Infrastructure as Code (IaC), and large-scale production operations. This role is... ...analysis (RCA), and postmortems Improve observability using logging, monitoring, and...
Remote work
Shift work
GR8 People
New York, NY
2 days ago
Site Reliability Engineering (SRE) Platform Engineer (Lead)
...Site Reliability Engineering (SRE) Platform Engineer (Lead) Job Number:... ...software enhancements, observability, automation, and... ...Infrastructure: Microsoft Azure (Software, Storage,... ...troubleshooting. DevSecOps & Automation: Lead... ...and change governance. Drive automation to...
Local area
Eclaro
Rochester, NY
2 days ago
Remote SRE: Observability & Telemetry Engineer
A leading consulting firm is seeking a Site Reliability Engineer (SRE) for a remote role based in Germany. The successful candidate will design and maintain observability platforms, automate deployments, and contribute to scalable monitoring solutions. Required qualifications...
Remote job
Fixed term contract
Starcom consulting limited
New Bremen, OH
2 days ago
Azure DevOps Platform Engineer CI/CD, SRE & Observability
...Technical Skills Azure DevOps (repos, pipelines) CI/CD pipelines... ...JavaScript, Python) Grafana or observability tools SonarQube (code quality... ...environments Knowledge of SRE principles and incident management... ...resolution time, platform reliability, developer productivity, and...
Apex Systems
New York, NY
2 days ago
Senior DevOps Engineer / Site Reliability Engineer (SRE)
$140k - $160k
...seeking a Senior DevOps Engineer / Site Reliability Engineer (SRE) to architect and... ...• Build and operate observability platforms and CI/CD pipelines... ...cloud platform (AWS/Azure/GCP) with deep understanding... ..., and service governance • Skilled in IaC (Infrastructure as Code...
Immediate start
Remote work
Thomas Talent Network
Raleigh, NC
14 days ago
Site Reliability Engineering (SRE) Automation and Orchestration Engineer
$73.45k - $132.78k
...opening for a Site Reliability Automation and... ...Orchestration Engineer on a high-... ...transfer. SRE Engineering and... ...supporting the federal government. Additional... ...of Agile and DevSecOps/SRE concepts... ...Jira and/or Azure DevOps workflows... ...as Code (IaC) tools such as...
Local area
Immediate start
Remote work
Leidos
Fort Shafter, HI
1 day ago
SRE Platform Engineer Lead - Enterprise Integrations
$141.6k - $212.4k
...Looking for a SRE Platform... ...combining platform engineering + SRE +... ...ops, schema governance, RBAC/ACLs,... ...expertise in observability using Datadog... ...building DevSecOps automation: CI/CD, IaC, security controls... ...and improve reliability. DevSecOps... ...Our Careers Site is only for...
Work experience placement
Work from home
Weekend work
Weekday work
Nutanix
San Diego, CA
1 day ago
Principal SRE Engineer (US Citizen)
$147k - $237.5k
...for a Principal SRE to join our InfoSec... ...your mad SRE/DevSecOps skills, you’ll build... ...Infrastructure as Code (IaC) using Terraform... ...and end‑to‑end observability across production... ...Computer Science/Engineering or equivalent... ...handling US Federal Government project...
Full time
Work at office
Visa sponsorship
Work visa
Palo Alto Networks, Inc.
Santa Clara, CA
20 hours ago
DevSecOps / Cloud Security Engineer
...architecture, automation, governance, and compliance to protect... ...including Infrastructure as Code (IaC), DevSecOps integration, identity and... ...Support unified observability capabilities Configure... ...~ Experience with AWS, Azure, or GCP 5+ years' experience...
Remote work
RICEFW Technologies
United States
2 days ago
SRE / Dynatrace Observability, Automation Engineer
SRE / Dynatrace Observability & Automation Engineer Mount Laurel, NJ | 6 - 8 years of experience Job Description Strong hands‑on experience with Dynatrace... ...of known failure patterns. Partner with platform, governance, and control teams to ensure observability,...
Permanent employment
Tata Consultancy Services Limited
Mount Laurel, NJ
4 days ago
Senior SRE Engineer (AKS, Azure, Terraform, Kubernetes, and PowerShell.)
$85 - $90 per hour
Senior SRE Engineer (AKS, Azure, Terraform, Kubernetes, and PowerShell.) JOB ID... ...Month contract *** 4 days on-site *** -- We have a great... ...Kubernetes. Experience using IAC tools such as Terraform,... ...scripting Experience managing observability tools such as Grafana,...
Hourly pay
Contract work
Work experience placement
CorGTA
Dallas, TX
20 hours ago
Senior Platform Engineer, Azure, Terraform, IaC, Security, DR - Hybrid/Dallas, TX
...Senior Platform Engineer, Azure, Terraform, IaC, Security, DR –... ...automation and governance that empowers... ...compliance and reliability locked in. Join... ...infrastructure modules, observability, application... ...in Systems/SRE/DevOps roles.... ...security gates (DevSecOps). ~ Expert in...
Permanent employment
Work at office
2 days per week
3 days per week
Riccione Resources
Fort Worth, TX
19 days ago
Remote DevSecOps Engineer: Azure, IaC & Security Automation
...seeking a results-driven DevOps / DevSecOps Engineer to automate and secure cloud infrastructure on Azure. The role demands deep... ...expertise in Infrastructure as Code (IaC), CI/CD pipelines, and... ...and security teams to enhance reliability and compliance. Responsibilities...
Remote work
Avaya
New York, NY
4 days ago
Staff Observability Platform Engineer (SRE)
$118.45k - $236.9k
...Lead Platform Reliability Engineer We're building a world of health around... ...design and implement metrics and observability frameworks with a strong... ..., Platform Engineering, or SRE. ~7+ years of experience with... ...platforms (AWS, GCP, or Azure). ~5+ years designing and...
Hourly pay
Full time
Temporary work
Oak St. Health
Scottsdale, AZ
1 day ago
Senior Platform Engineer (DevSecOps - San Antonio)
...Senior Platform Engineer (DevSecOps) to architect... ..., and is on site. What You’ll... ...& IaC: Architect, implement... ...infrastructure (AWS, Azure, GCP, OCI)... .... Cloud Governance: Establish and... .... Site Reliability Engineering (SRE): Implement robust observability (monitoring,...
Flexible hours
Shift work
Rippling
San Antonio, TX
1 day ago
Senior Site Reliability Engineer (SRE) - Operations
...Senior Site Reliability Engineer - Operations As a leading financial services... ...Site Reliability Engineer (SRE) to join our Operations team... ...monitoring, alerting, and observability tools (e.g., Prometheus, Grafana... ...infrastructure (AWS, Azure, GCP, VMware, etc.). Manage...
Ongoing contract
Casual work
Remote work
Flexible hours
SS&C Technologies Holdings
Pensacola, FL
4 days ago
Cloud AWS Support Reliability Engineer (SRE)
$160k - $230k
...infrastructure Implementing observability and monitoring to ensure delivery... ...COTS tools Write Terraform IaC code to install custom... ...JBoss, Tomcat) Knowledge of DevSecOps, Agile-Scrum, JIRA methodologies... ...Solid foundations of DevOps and SRE principles and roles Technical...
Permanent employment
Full time
Day shift
Federal Reserve Bank of New York
New York, NY
8 hours ago
Site Reliability Engineer (SRE)
...secure, and user-friendly applications. Site Reliability Engineer (SRE) Job Title: Site Reliability Engineer... ...that promote safe, frequent, and observable releases. Lead capacity planning and performance... ...with a major cloud platform (AWS, Azure, or GCP). Background in capacity...
Full time
H1b
Local area
Immediate start
Remote work
Bright Vision Technologies
Frisco, TX
3 days ago
Site Reliability Engineer - Observability & Performance Engineering
$65 - $70 per hour
...leading organization in the technology sector, is seeking a Site Reliability Engineer – Observability & Performance Engineering to join their team. As a Site... ...of experience in IT, with at least 6 years in DevOps, SRE, or performance engineering roles. ~ Proficiency in...
Weekly pay
3 days per week
Manpower Group Inc.
Chandler, AZ
4 days ago
Site Reliability Engineer (SRE)
...Site Reliability Engineer (SRE) | Lockheed Martin The 1LMX MES COE is seeking an engineer who will own infrastructure... ...Cloud Service Providers (e.g., AWS, Azure) - Experience with automation tools / scripting (e.g., GitLab, Ansible, IaC) - Experience working with both Windows...
Remote work
Lockheed Martin Corporation
United States
1 day ago
Site Reliability Engineer (SRE)
...We're seeking a highly skilled Site Reliability Engineer (SRE) to join our engineering team and help ensure... ...and improve system performance Build observability into systems through metrics,... ...experience Experience with cloud platforms (e.g., Azure) and container orchestration (e.g.,...
Temporary work
Interim role
Remote work
Flexible hours
OutSolve - Beyond Compliance
Mission, KS
4 days ago
Site Reliability Engineer (SRE)
$1,000 per month
...role We're looking for a talented Site Reliability Engineer to join our infrastructure team and help... ...our global banking platform. As an SRE at Bloxley, you'll be responsible for... ...: Ensure high availability, observability, and security across all financial systems...
Full time
Immediate start
Remote work
Worldwide
Flexible hours
Bloxley
Mission, KS
4 days ago
Platform SRE Engineer: Scalable Cloud & Observability
...DroneUp, LLC is hiring an SRE - Platform Engineer in the United States, focusing on the reliability and performance of their IT infrastructure while mentoring teams. Responsibilities include managing SLOs and incident response while working with cloud technologies such...
DroneUp
New York, NY
2 days ago
