Site Reliability Engineer (SRE) - Azure | DevSecOps | IaC | Governance | Observability
$129k - $143k
Avaya
About Avaya
Avaya is an enterprise software leader that helps the world’s largest organizations and government agencies forge unbreakable connections. The Avaya Infinity™ platform unifies fragmented customer experiences, connecting the channels, insights, technologies, and workflows that together create enduring customer and employee relationships. We believe success is built through strong connections – with each other, with our work, and with our mission. At Avaya, you'll find a community that values your contributions and supports your growth every step of the way. Learn more at

Description
We are seeking a Site Reliability Engineer (SRE) who will drive stability, reliability, and performance across our Azure and GCP-based platforms. This role blends operational excellence, proactive incident management, and strong collaboration with DevOps, Cloud, and Security teams.

The ideal candidate will have hands-on experience with multi-cloud environments (Azure and GCP), IaC (Terraform/Ansible), CI/CD (Jenkins/GitHub Actions), and modern observability and AI-Ops systems. The engineer will also contribute to governance, cost optimization, and automation strategies that reduce toil and prevent issues before they occur. A key aspect of this role is the ability to perform deep-dive troubleshooting of application performance and errors by analyzing logs and traces in platforms like Grafana and Datadog.

This position includes 24×7 support coverage (rotational) and requires strong ownership in managing major incidents, RCA processes, and continuous service improvements.

Key Responsibilities

Reliability & Incident Management
- Serve as a key member of the 24×7 on-call rotation, responding to and managing incidents across production and pre-production environments.
- Lead incident bridges, coordinate root cause analysis (RCA), and ensure post-incident reviews drive systemic improvements.
- Maintain clear communication with cross-functional teams and leadership during major incidents.

Monitoring, AI-Ops, Alerts & Prevention
- Build, tune, and maintain observability dashboards (Azure Monitor, GCP Operations Suite, Prometheus, Grafana, Datadog, Log Analytics).
- Perform deep-dive troubleshooting of application and service-level issues using distributed tracing and log analysis (Grafana, Datadog) to pinpoint root causes beyond infrastructure.
- Define SLOs, SLIs, and error budgets to proactively identify and mitigate reliability risks before customer impact.
- Integrate AI-Ops tools for anomaly detection, predictive alerting, and automated incident correlation.
- Continuously enhance alert quality, reduce false positives, and automate runbooks for faster recovery.
- Analyze trends to prevent recurring issues and support teams in resilience engineering.

Requirements

Required Skills & Experience
- 5+ years in Site Reliability, DevOps, Cloud Operations, or customer support roles.
- Demonstrated experience in application-level troubleshooting by analyzing logs and traces to identify bugs, performance bottlenecks, and error conditions.
- Expertise in Azure and GCP cloud operations and distributed system reliability.
- Understanding of Terraform, Ansible, and CI/CD pipelines (Jenkins, GitHub Actions).
- Experience with observability and AI-Ops tools (Azure Monitor, GCP Operations Suite, Grafana, Prometheus, Datadog, etc.).
- Solid grasp of incident management frameworks (P1–P3 handling, RCA, PIRs, on-call rotations).
- Excellent analytical, troubleshooting, and communication skills.

Desired Behaviours
- Proactive Prevention: Identifies and resolves risks before they escalate into incidents.
- AI-Driven Mindset: Applies AI and automation to improve reliability and reduce human intervention.
- Accountability: Owns service reliability and communicates with clarity.
- Collaboration: Works seamlessly with platform, DevOps, and product teams.
- Efficiency: Focuses on automation to reduce manual effort and improve MTTR.
- Continuous Improvement: Learns from failures, iterates processes, and enhances documentation.

The pay range for this opportunity is from $129,000 to $143,000 + performance-related bonus + benefits. This range represents the anticipated low and high end of the salary for this position. This role is also eligible to receive an annual bonus that aligns with individual and company performance. Actual salaries will vary and are based on factors such as a candidate’s qualifications, skills, and competencies.

Applicants must be currently authorized to work in the United States without the need for visa sponsorship now or in the future.

Avaya is an Equal Opportunity employer and a U.S. Federal Contractor. Our commitment to equality is a core value of Avaya. All qualified applicants and employees receive equal treatment without consideration for race, religion, sex, age, sexual orientation, gender identity, national origin, disability, status as a protected veteran or any other protected characteristic. In general, positions at Avaya require the ability to communicate and use office technology effectively. Physical requirements may vary by assigned work location.

This job brief/description is subject to change. Nothing in this job description restricts Avaya's right to alter the duties and responsibilities of this position at any time for any reason.


