Site Reliability Engineer (SRE) - Azure | DevSecOps | IaC | Governance | Observability

$129k - $143k

Avaya Corporation

Date: Mar 17, 2026
Location: Remote, US
Requisition ID: 37592

About Avaya

Avaya is an enterprise software leader that helps the world’s largest organizations and government agencies forge unbreakable connections. The Avaya Infinity™ platform unifies fragmented customer experiences, connecting the channels, insights, technologies, and workflows that together create enduring customer and employee relationships. We believe success is built through strong connections – with each other, with our work, and with our mission. At Avaya, you'll find a community that values your contributions and supports your growth every step of the way. Learn more at

Description

We are seeking a Site Reliability Engineer (SRE) who will drive stability, reliability, and performance across our Azure and GCP-based platforms. This role blends operational excellence, proactive incident management, and strong collaboration with DevOps, Cloud, and Security teams.

The ideal candidate will have hands-on experience with multi-cloud environments (Azure and GCP), IaC (Terraform/Ansible), CI/CD (Jenkins/GitHub Actions), and modern observability and AI-Ops systems. The engineer will also contribute to governance, cost optimization, and automation strategies that reduce toil and prevent issues before they occur. A key aspect of this role is the ability to perform deep-dive troubleshooting of application performance and errors by analyzing logs and traces in platforms like Grafana and Datadog.

This position includes 24×7 support coverage (rotational) and requires strong ownership in managing major incidents, RCA processes, and continuous service improvements.

Key Responsibilities

Reliability & Incident Management

Serve as a key member of the 24×7 on-call rotation, responding to and managing incidents across production and pre-production environments.
Lead incident bridges, coordinate root cause analysis (RCA), and ensure post-incident reviews drive systemic improvements.
Maintain clear communication with cross-functional teams and leadership during major incidents.

Monitoring, AI-Ops, Alerts & Prevention

Build, tune, and maintain observability dashboards (Azure Monitor, GCP Operations Suite, Prometheus, Grafana, Datadog, Log Analytics).
Perform deep-dive troubleshooting of application and service-level issues using distributed tracing and log analysis (Grafana, Datadog) to pinpoint root causes beyond infrastructure.
Define SLOs, SLIs, and error budgets to proactively identify and mitigate reliability risks before customer impact.
Integrate AI-Ops tools for anomaly detection, predictive alerting, and automated incident correlation.
Continuously enhance alert quality, reduce false positives, and automate runbooks for faster recovery.
Analyze trends to prevent recurring issues and support teams in resilience engineering.

Requirements

Required Skills & Experience

5+ years in Site Reliability, DevOps, Cloud Operations, or Customer Support roles.
Demonstrated experience in application-level troubleshooting by analyzing logs and traces to identify bugs, performance bottlenecks, and error conditions.
Expertise in Azure and GCP cloud operations and distributed system reliability.
Understanding of Terraform, Ansible, and CI/CD pipelines (Jenkins, GitHub Actions).
Experience with observability and AI-Ops tools (Azure Monitor, GCP Operations Suite, Grafana, Prometheus, Datadog, etc.).
Solid grasp of incident management frameworks (P1–P3 handling, RCA, PIRs, on-call rotations).
Excellent analytical, troubleshooting, and communication skills.

Desired Behaviours

Proactive Prevention: Identifies and resolves risks before they escalate into incidents.
AI-Driven Mindset: Applies AI and automation to improve reliability and reduce human intervention.
Accountability: Owns service reliability and communicates with clarity.
Collaboration: Works seamlessly with platform, DevOps, and product teams.
Efficiency: Focuses on automation to reduce manual effort and improve MTTR.
Continuous Improvement: Learns from failures, iterates processes, and enhances documentation.

The pay range for this opportunity is from $129,000 to $143,000 + performance-related bonus + benefits. This range represents the anticipated low and high end of the salary for this position. This role is also eligible to receive an annual bonus that aligns with individual and company performance. Actual salaries will vary and are based on factors such as a candidate’s qualifications, skills, and competencies.

Applicants must be currently authorized to work in the United States without the need for visa sponsorship now or in the future.

Avaya is an Equal Opportunity employer and a U.S. Federal Contractor. Our commitment to equality is a core value of Avaya. All qualified applicants and employees receive equal treatment without consideration for race, religion, sex, age, sexual orientation, gender identity, national origin, disability, status as a protected veteran or any other protected characteristic. In general, positions at Avaya require the ability to communicate and use office technology effectively. Physical requirements may vary by assigned work location.

This job brief/description is subject to change. Nothing in this job description restricts Avaya's right to alter the duties and responsibilities of this position at any time for any reason.

Vacancy posted 12 hours ago

Similar jobs that could be interesting for you
Based on the Site Reliability Engineer (SRE) - Azure | DevSecOps | IaC | Governance | Observability in New York, NY vacancy

Remote Datadog Observability SRE Engineer
Freelanceshop is looking for a remote SRE Observability Engineer (Datadog Specialist) to enhance our cloud-based platforms. This critical role involves designing monitoring systems to ensure reliability and performance. You will collaborate with various teams to provide...
Suggested
Remote job
Freelanceshop
New York, NY
3 days ago
Senior Site Reliability Engineer (SRE)
...Posted on 11/13/2025 Title: Senior Site Reliability Engineer (SRE) Location: Remote About January At January... ...infrastructure, design modern observability solutions, and build sustainable on-... ...and maintain Infrastructure as Code (IaC) using Terraform or CloudFormation for...
Suggested
Remote work
Govserviceshub
New York, NY
22 hours ago
Senior Platform Engineer - DevSecOps
$113.9k - $189.9k
...The One Policy Engine is a unified policy platform... ...(IaC) compliant from code... ...hands-on experience with Azure, AWS and DevSecOps. The ideal candidate... ...to deliver scalable, reliable, and high-performance... ...processes, improve system observability, and ensure high availability...
Suggested
Part time
Internship
LSEG (London Stock Exchange Group)
New York, NY
1 day ago
Senior Site Reliability Engineer
$182.3k - $220k
...that mission depends on reliable, secure, and scalable systems. As a Senior SRE on the infrastructure... ...tools that empower our engineers to ship safely and confidently... ..., performance and observability – partnering closely... ...infrastructure workflows using IaC and other cloud native...
Suggested
Local area
Flexible hours
Ro
New York, NY
a month ago
Staff Site Reliability Engineer
$211.7k - $292k
...that mission depends on reliable, secure, and scalable systems. As a Staff SRE on the infrastructure... ...tools that empower our engineers to ship safely and confidently... ..., performance and observability – partnering closely... ...infrastructure workflows using IaC and other cloud native...
Suggested
Local area
Flexible hours
Ro
New York, NY
2 days ago
SRE/DevOps Engineer
...Versana is seeking a motivated SRE/DevOps Engineer with strong observability experience to join our... .... • Improve system reliability and resiliency. •... ...years of experience as a Site Reliability Engineer or similar... ...with public cloud (Azure, AWS or GCP). • 3+ years...
Work experience placement
Local area
Versana
New York, NY
24 days ago
SRE Engineer
$60 - $65 per hour
...SRE Engineer (W2) Jersey City, NJ (Onsite) 6 Months Contract to Hire Job Description: Proficient in application development skills for... ...third-party applications and integrations Familiarity with observability practices such as white and black box monitoring, service level...
Full time
Contract work
Work experience placement
Pinnacle Group
Jersey City, NJ
22 hours ago
Senior Site Reliability Engineer
$175k - $200k
...Magazine. The Role As a Senior Site Reliability Engineer on the Platform team, you... ...through automation, observability, performance tuning, and capacity... ...infrastructure as code (IaC) practices using tools such... ...environments Familiarity with SRE methodologies including...
Part time
Work at office
Flexible hours
Order.co
New York, NY
22 hours ago
Senior Site Reliability Engineer — Observability & CI/CD
jobr.pro is seeking a Senior Site Reliability Engineer in New York, NY, to enhance platform reliability and engineering excellence. You will be instrumental in implementing observability, security, and CI/CD practices. This role involves coaching teams and optimizing workflows...
jobr.pro
New York, NY
22 hours ago
Director, Site Reliability Engineering - Digital Assets
$126k - $255k
...within the TechOps SRE team, you'll work... ...closely with our engineering partners to help... ...specializing in site reliability. The Skills and... ...as Code (IaC) standards to improve... ...and maintaining observability (logging, monitoring... ...Fidelity’s business is governed by the provisions...
Work from home
Fidelity Investments
Weehawken, NJ
22 hours ago
Staff Site Reliability Engineer
$136k - $180k
As a Staff Site Reliability Engineer, you will be a key technical leader responsible... ...Infrastructure as Code (IaC) strategy, and ensure our... ...the expert for scalability, observability, and building the robust, automated... ...Experience: 8+ years in an SRE, DevOps, or Infrastructure...
Remote work
Kevala Inc.
New York, NY
22 hours ago
Lead Cloud DevSecOps Engineer
...delivers secure, reliable technology... ...standards and governance. Pay and Benefits... ...a Lead DevOps Engineer within DTCC's... ...reliability, observability, and secure... ...practices in DevSecOps and cybersecurity... ...Development (IAC) skills using... ...experience Azure (e.g., Azure...
Remote work
Flexible hours
Dtcc
Jersey City, NJ
3 days ago
Senior Platform Engineer
$185k - $231k
...operating system for governed financial intelligence... ...for a Senior Platform Engineer to join our growing Platform... ...Build observability and logging infrastructure... ...engineering, DevOps, SRE, or infrastructure roles... ...Cloud Platform (GCP) and Azure ~ Proficiency with infrastructure...
Monstro
New York, NY
13 days ago
Staff Site Reliability Engineer - Observability
$194k - $267k
...Identity belongs to you. We are seeking a highly technical Observability Site Reliability Engineer with a specialty in Google Cloud, to own and expand our... ..., scalable Observability Platform that enables our SRE teams and business partners. You will treat infrastructure...
Permanent employment
Full time
Work at office
Local area
Flexible hours
Okta
New York, NY
more than 2 months ago
AI Engineer
$130k - $200k
..., and local government entities. The... ...level AI/ML Engineer to deploy AI... ...for reliability and scalability. Apply Site Reliability... ...Engineering (SRE) principles... ...(AWS, GCP, Azure) and MLOps workflows... ...Kubernetes, and IaC tools (... ...optimization, and ML observability (Prometheus,...
Work at office
Local area
Remote work
Metropolitan Commercial Bank
New York, NY
4 days ago
Senior Site Reliability Engineer
...was a machine learning research engineer at Scale AI. The rest of our team... ...state-of-the-art AI. As a Senior SRE, you'll tackle the scaling and reliability challenges that come with adding... ...and building the automation and observability that keep Unify fast and reliable...
Unify
New York, NY
22 hours ago
Sr. Azure Infrastructure Engineer [Disaster Recovery enablement]
Remote Sr. Azure Infrastructure Engineer This role is responsible... ...as Code (IaC) using tools such... ...Recovery & Azure Site Recovery (ASR):... ...Security, Compliance & Governance: Implement... ...management. Drive observability improvements by... ...to secure, reliable systems...
Remote work
New Era Technology
New York, NY
22 hours ago
DevSecOps Engineer
...Requisition: 1429 Job Title: DevSecOps Engineer Location: Remote Clearance... ...CD), Infrastructure as Code (IaC), and Cloud Development Environments... ...Automated Infrastructure & Governance Utilize low‑code or... ...Experience: 3-5 years in DevSecOps, SRE, or Cloud Engineering....
Remote work
Shift work
Agile Defense
New York, NY
22 hours ago
Observability Platform Engineer
...Observability Engineer Neuberger's Technology team is seeking... ...Monitoring (RUM) to improve reliability, accelerate incident... ...with application, SRE/DevOps,... ...cost optimization, data governance, and scaling strategies... ...with cloud platforms (Azure and AWS) and centralizing...
Work at office
Neuberger Berman
New York, NY
3 days ago
Senior Software Engineer - Observability and Reliability
$170k - $240k
SENIOR SOFTWARE ENGINEER - OBSERVABILITY AND RELIABILITY ABOUT THE ROLE We are growing... ...infrastructure (GCP, AWS, Azure) * Startup experience... ...moving data or breaking governance. Sigma supports a spreadsheet... ...job application on this site, Sigma processes your personal...
Full time
Work at office
Flexible hours
Sigma Computing
New York, NY
4 days ago
Remote Security-Driven DevSecOps Engineer for CI/CD & IaC
$110k - $150k
Agile Defense, LLC is looking for a DevSecOps Engineer to join their remote team. This role requires 3-5 years of experience in DevSecOps, focusing on building and sustaining a secure software delivery environment. The ideal candidate should have knowledge of Continuous...
Remote job
Agile Defense, LLC
New York, NY
3 days ago
Site Reliability Engineering
...Description Forhyre is looking for engineers who can bring unique... ...building a culture of reliability and observability Engage in and improve the... ...subject matter expert in an SRE mindset, best practices, and... ...cloud infrastructure, AWS, Azure & Google Cloud Strong sense...
Forhyre
New York, NY
24 days ago
Director, Splunk Platform Engineering & SRE
...Director, Splunk Platform Engineering & SRE At BNY, our... ...center of enterprise observability and cybersecurity. This... ...Drive platform reliability, capacity, observability... ...models, ensuring strong governance and compliance Design... ...Strong foundation in Site Reliability...
Work experience placement
Worldwide
BNY
New York, NY
4 days ago
Staff Site Reliability Engineer
$200k - $250k
...the Role We’re looking for a Staff Site Reliability Engineer to lead the evolution of Tabs’ platform... ...operate systems that are reliable, observable, and easy to develop on. You’ll own... ...across teams Experience ~10+ years in SRE, infrastructure, or backend...
Full time
Contract work
Work at office
Tabs
New York, NY
21 days ago
Sr Data Platforms Engineer
$111k - $222k
...development experience for engineering and product teams by delivering reliable data platforms. You... ...seamless data movement and governance Our Mission:... ...as Code (IaC), and CI/CD pipelines... ...Experience with monitoring and observability tools (Prometheus, Grafana...
Work at office
Flexible hours
DoubleVerify
New York, NY
4 days ago
Oracle Cloud Architect
...for Oracle Cloud along with AWS, Azure, frameworks, governance models, and cloud standards for... ...This role may require guiding engineering teams or supplementing their... ...Terraform, Ansible, API/CLI automation SRE and DevOps: Monitoring, Observability, SLAs/SLOs/SLIs Key Skills OCI:...
Neotecra
New York, NY
3 days ago
Senior Software Engineer - Bits AI SRE
$187k - $240k
...for a product-minded engineer to help us quickly define... ...emphasis on building reliable, production-quality AI... ...in Bits AI SRE. Develop customer-facing... ...authorizations from the US government. This job is available... ...Datadog is the leading observability and security platform...
Work at office
Datadog
New York, NY
2 days ago
Senior Software Engineering Manage
...Senior Software Engineer/SRE - TRAX Observability TRAde Automation and eXecution (TRAX) is part of Bloomberg Enterprise Products Engineering.... ...tools and analysis required to reason about performance and reliability. We partner closely with TRAX engineering teams and our...
Bloomberg
New York, NY
3 days ago
DevSecOps Engineer III
$117.8k - $189k
...postings on employment sites will direct... ...The Cloud DevSecOps Engineer III is responsible... ...Operations with governance mechanisms · Help... ...highly available, reliable, stable products... ...similar tools for observability · Solid understanding... ...knowledge on Azure and an understanding...
Temporary work
Remote work
Flexible hours
Day shift
Kapitus
New York, NY
24 days ago
Senior Site Reliability Engineer
$150k - $200k
Join to apply for the Senior Site Reliability Engineer role at Gradle Inc. Develocity is a first‑of‑its‑kind toolchain observability and acceleration platform that helps software teams... ...results. Who You Are We’re building a new SRE team and looking for founding members to...
Full time
Local area
Remote work
Work from home
Gradle Inc.
New York, NY
22 hours ago
