Site Reliability Engineer (SRE) - Azure | DevSecOps | IaC | Governance | Observability
$129k - $143k
Avaya Corporation
Date: Mar 17, 2026
Location: Remote, US
Requisition ID: 37592

About Avaya

Avaya is an enterprise software leader that helps the world’s largest organizations and government agencies forge unbreakable connections. The Avaya Infinity™ platform unifies fragmented customer experiences, connecting the channels, insights, technologies, and workflows that together create enduring customer and employee relationships.

We believe success is built through strong connections – with each other, with our work, and with our mission. At Avaya, you'll find a community that values your contributions and supports your growth every step of the way.

Description

We are seeking a Site Reliability Engineer (SRE) who will drive stability, reliability, and performance across our Azure and GCP-based platforms. This role blends operational excellence, proactive incident management, and strong collaboration with DevOps, Cloud, and Security teams.

The ideal candidate will have hands-on experience with multi-cloud environments (Azure and GCP), IaC (Terraform/Ansible), CI/CD (Jenkins/GitHub Actions), and modern observability and AI-Ops systems. The engineer will also contribute to governance, cost optimization, and automation strategies that reduce toil and prevent issues before they occur. A key aspect of this role is the ability to perform deep-dive troubleshooting of application performance and errors by analyzing logs and traces in platforms like Grafana and Datadog.

This position includes 24×7 support coverage (rotational) and requires strong ownership in managing major incidents, RCA processes, and continuous service improvements.

Key Responsibilities

Reliability & Incident Management
- Serve as a key member of the 24×7 on-call rotation, responding to and managing incidents across production and pre-production environments.
- Lead incident bridges, coordinate root cause analysis (RCA), and ensure post-incident reviews drive systemic improvements.
- Maintain clear communication with cross-functional teams and leadership during major incidents.

Monitoring, AI-Ops, Alerts & Prevention
- Build, tune, and maintain observability dashboards (Azure Monitor, GCP Operations Suite, Prometheus, Grafana, Datadog, Log Analytics).
- Perform deep-dive troubleshooting of application and service-level issues using distributed tracing and log analysis (Grafana, Datadog) to pinpoint root causes beyond infrastructure.
- Define SLOs, SLIs, and error budgets to proactively identify and mitigate reliability risks before customer impact.
- Integrate AI-Ops tools for anomaly detection, predictive alerting, and automated incident correlation.
- Continuously enhance alert quality, reduce false positives, and automate runbooks for faster recovery.
- Analyze trends to prevent recurring issues and support teams in resilience engineering.

Requirements

Required Skills & Experience
- 5+ years in Site Reliability, DevOps, Cloud Operations, or customer support roles.
- Demonstrated experience in application-level troubleshooting by analyzing logs and traces to identify bugs, performance bottlenecks, and error conditions.
- Expertise in Azure and GCP cloud operations and distributed system reliability.
- Understanding of Terraform, Ansible, and CI/CD pipelines (Jenkins, GitHub Actions).
- Experience with observability and AI-Ops tools (Azure Monitor, GCP Operations Suite, Grafana, Prometheus, Datadog, etc.).
- Solid grasp of incident management frameworks (P1–P3 handling, RCA, PIRs, on-call rotations).
- Excellent analytical, troubleshooting, and communication skills.

Desired Behaviours
- Proactive Prevention: Identifies and resolves risks before they escalate into incidents.
- AI-Driven Mindset: Applies AI and automation to improve reliability and reduce human intervention.
- Accountability: Owns service reliability and communicates with clarity.
- Collaboration: Works seamlessly with platform, DevOps, and product teams.
- Efficiency: Focuses on automation to reduce manual effort and improve MTTR.
- Continuous Improvement: Learns from failures, iterates processes, and enhances documentation.

The pay range for this opportunity is from $129,000 to $143,000 + performance-related bonus + benefits. This range represents the anticipated low and high end of the salary for this position. This role is also eligible to receive an annual bonus that aligns with individual and company performance. Actual salaries will vary and are based on factors such as a candidate’s qualifications, skills, and competencies.

Applicants must be currently authorized to work in the United States without the need for visa sponsorship now or in the future.

Avaya is an Equal Opportunity employer and a U.S. Federal Contractor. Our commitment to equality is a core value of Avaya. All qualified applicants and employees receive equal treatment without consideration for race, religion, sex, age, sexual orientation, gender identity, national origin, disability, status as a protected veteran or any other protected characteristic. In general, positions at Avaya require the ability to communicate and use office technology effectively. Physical requirements may vary by assigned work location.

This job brief/description is subject to change. Nothing in this job description restricts Avaya's right to alter the duties and responsibilities of this position at any time for any reason.
- Freelanceshop is looking for a remote SRE Observability Engineer (Datadog Specialist) to enhance our cloud-based platforms. This critical role involves designing monitoring systems to ensure reliability and performance. You will collaborate with various teams to provide...
  Suggested, Remote job
- ...Posted on 11/13/2025 Title: Senior Site Reliability Engineer (SRE) Location: Remote About January At January... ...infrastructure, design modern observability solutions, and build sustainable on-... ...and maintain Infrastructure as Code (IaC) using Terraform or CloudFormation for...
  Suggested, Remote work
$113.9k - $189.9k
...The One Policy Engine is a unified policy platform... ...(IaC) compliant from code... ...hands-on experience with Azure, AWS and DevSecOps. The ideal candidate... ...to deliver scalable, reliable, and high-performance... ...processes, improve system observability, and ensure high availability...
Suggested, Part time, Internship
$182.3k - $220k
...that mission depends on reliable, secure, and scalable systems. As a Senior SRE on the infrastructure... ...tools that empower our engineers to ship safely and confidently... ..., performance and observability – partnering closely... ...infrastructure workflows using IaC and other cloud native...
Suggested, Local area, Flexible hours
$211.7k - $292k
...that mission depends on reliable, secure, and scalable systems. As a Staff SRE on the infrastructure... ...tools that empower our engineers to ship safely and confidently... ..., performance and observability – partnering closely... ...infrastructure workflows using IaC and other cloud native...
Suggested, Local area, Flexible hours
- ...Versana is seeking a motivated SRE/DevOps Engineer with strong observability experience to join our... .... • Improve system reliability and resiliency. •... ...years of experience as a Site Reliability Engineer or similar... ...with public cloud (Azure, AWS or GCP). • 3+ years...
  Work experience placement, Local area
$60 - $65 per hour
...SRE Engineer (W2) Jersey City, NJ (Onsite) 6 Months Contract to Hire Job Description: Proficient in application development skills for... ...third-party applications and integrations Familiarity with observability practices such as white and black box monitoring, service level...
Full time, Contract work, Work experience placement
$175k - $200k
...Magazine. The Role As a Senior Site Reliability Engineer on the Platform team, you... ...through automation, observability, performance tuning, and capacity... ...infrastructure as code (IaC) practices using tools such... ...environments Familiarity with SRE methodologies including...
Part time, Work at office, Flexible hours
- jobr.pro is seeking a Senior Site Reliability Engineer in New York, NY, to enhance platform reliability and engineering excellence. You will be instrumental in implementing observability, security, and CI/CD practices. This role involves coaching teams and optimizing workflows...
$126k - $255k
...within the TechOps SRE team, you'll work... ...closely with our engineering partners to help... ...specializing in site reliability. The Skills and... ...as Code (IaC) standards to improve... ...and maintaining observability (logging, monitoring... ...Fidelity’s business is governed by the provisions...
Work from home
$136k - $180k
As a Staff Site Reliability Engineer, you will be a key technical leader responsible... ...Infrastructure as Code (IaC) strategy, and ensure our... ...the expert for scalability, observability, and building the robust, automated... ...Experience: 8+ years in an SRE, DevOps, or Infrastructure...
Remote work
- ...delivers secure, reliable technology... ...standards and governance. Pay and Benefits... ...a Lead DevOps Engineer within DTCC's... ...reliability, observability, and secure... ...practices in DevSecOps and cybersecurity... ...Development (IAC) skills using... ...experience Azure (e.g., Azure...
  Remote work, Flexible hours
$185k - $231k
...operating system for governed financial intelligence... ...for a Senior Platform Engineer to join our growing Platform... ...Build observability and logging infrastructure... ...engineering, DevOps, SRE, or infrastructure roles... ...Cloud Platform (GCP) and Azure ~ Proficiency with infrastructure...
$194k - $267k
...Identity belongs to you. We are seeking a highly technical Observability Site Reliability Engineer with a specialty in Google Cloud, to own and expand our... ..., scalable Observability Platform that enables our SRE teams and business partners. You will treat infrastructure...
Permanent employment, Full time, Work at office, Local area, Flexible hours
$130k - $200k
..., and local government entities. The... ...level AI/ML Engineer to deploy AI... ...for reliability and scalability. Apply Site Reliability... ...Engineering (SRE) principles... ...(AWS, GCP, Azure) and MLOps workflows... ...Kubernetes, and IaC tools (... ...optimization, and ML observability (Prometheus,...
Work at office, Local area, Remote work
- ...was a machine learning research engineer at Scale AI. The rest of our team... ...state-of-the-art AI. As a Senior SRE, you'll tackle the scaling and reliability challenges that come with adding... ...and building the automation and observability that keep Unify fast and reliable...
- Remote Sr. Azure Infrastructure Engineer This role is responsible... ...as Code (IaC) using tools such... ...Recovery & Azure Site Recovery (ASR):... ...Security, Compliance & Governance: Implement... ...management. Drive observability improvements by... ...to secure, reliable systems. #J-188...
  Remote work
- ...Requisition: 1429 Job Title: DevSecOps Engineer Location: Remote Clearance... ...CD), Infrastructure as Code (IaC), and Cloud Development Environments... ...Automated Infrastructure & Governance Utilize low‑code or... ...Experience: 3-5 years in DevSecOps, SRE, or Cloud Engineering....
  Remote work, Shift work
- ...Observability Engineer Neuberger's Technology team is seeking... ...Monitoring (RUM) to improve reliability, accelerate incident... ...with application, SRE/DevOps,... ...cost optimization, data governance, and scaling strategies... ...with cloud platforms (Azure and AWS) and centralizing...
  Work at office
$170k - $240k
SENIOR SOFTWARE ENGINEER - OBSERVABILITY AND RELIABILITY ABOUT THE ROLE We are growing... ...infrastructure (GCP, AWS, Azure) * Startup experience... ...moving data or breaking governance. Sigma supports a spreadsheet... ...job application on this site, Sigma processes your personal...
Full time, Work at office, Flexible hours
$110k - $150k
Agile Defense, LLC is looking for a DevSecOps Engineer to join their remote team. This role requires 3-5 years of experience in DevSecOps, focusing on building and sustaining a secure software delivery environment. The ideal candidate should have knowledge of Continuous...
Remote job
- ...Description Forhyre is looking for engineers who can bring unique... ...building a culture of reliability and observability Engage in and improve the... ...subject matter expert in an SRE mindset, best practices, and... ...cloud infrastructure, AWS, Azure & Google Cloud Strong sense...
- ...Director, Splunk Platform Engineering & SRE At BNY, our... ...center of enterprise observability and cybersecurity. This... ...Drive platform reliability, capacity, observability... ...models, ensuring strong governance and compliance Design... ...Strong foundation in Site Reliability...
  Work experience placement, Worldwide
$200k - $250k
...the Role We’re looking for a Staff Site Reliability Engineer to lead the evolution of Tabs’ platform... ...operate systems that are reliable, observable, and easy to develop on. You’ll own... ...across teams Experience ~10+ years in SRE, infrastructure, or backend...
Full time, Contract work, Work at office
$111k - $222k
...development experience for engineering and product teams by delivering reliable data platforms. You... ...seamless data movement and governance Our Mission:... ...as Code (IaC), and CI/CD pipelines... ...Experience with monitoring and observability tools (Prometheus, Grafana...
Work at office, Flexible hours
- ...for Oracle Cloud along with AWS, Azure, frameworks, governance models, and cloud standards for... ...This role may require guiding engineering teams or supplementing their... ...Terraform, Ansible, API/CLI automation SRE and DevOps: Monitoring, Observability, SLAs/SLOs/SLIs Key Skills OCI:...
$187k - $240k
...for a product-minded engineer to help us quickly define... ...emphasis on building reliable, production-quality AI... ...in Bits AI SRE. Develop customer-facing... ...authorizations from the US government. This job is available... ...Datadog is the leading observability and security platform...
Work at office
- ...Senior Software Engineer/SRE - TRAX Observability TRAde Automation and eXecution (TRAX) is part of Bloomberg Enterprise Products Engineering.... ...tools and analysis required to reason about performance and reliability. We partner closely with TRAX engineering teams and our...
$117.8k - $189k
...postings on employment sites will direct... ...The Cloud DevSecOps Engineer III is responsible... ...Operations with governance mechanisms · Help... ...highly available, reliable, stable products... ...similar tools for observability · Solid understanding... ...knowledge on Azure and an understanding...
Temporary work, Remote work, Flexible hours, Day shift
$150k - $200k
Join to apply for the Senior Site Reliability Engineer role at Gradle Inc. Develocity is a first‑of‑its‑kind toolchain observability and acceleration platform that helps software teams... ...results. Who You Are We’re building a new SRE team and looking for founding members to...
Full time, Local area, Remote work, Work from home



