
Site Reliability Engineer (SRE) - Azure | DevSecOps | IaC | Governance | Observability

$129k - $143k

Avaya Corporation

Date: Mar 17, 2026
Location: Remote, US
Requisition ID: 37592

About Avaya

Avaya is an enterprise software leader that helps the world’s largest organizations and government agencies forge unbreakable connections. The Avaya Infinity™ platform unifies fragmented customer experiences, connecting the channels, insights, technologies, and workflows that together create enduring customer and employee relationships. We believe success is built through strong connections – with each other, with our work, and with our mission. At Avaya, you'll find a community that values your contributions and supports your growth every step of the way.

Description

We are seeking a Site Reliability Engineer (SRE) who will drive stability, reliability, and performance across our Azure and GCP-based platforms. This role blends operational excellence, proactive incident management, and strong collaboration with DevOps, Cloud, and Security teams. The ideal candidate will have hands-on experience with multi-cloud environments (Azure and GCP), IaC (Terraform/Ansible), CI/CD (Jenkins/GitHub Actions), and modern observability and AI-Ops systems. The engineer will also contribute to governance, cost optimization, and automation strategies that reduce toil and prevent issues before they occur.

A key aspect of this role is the ability to perform deep-dive troubleshooting of application performance and errors by analyzing logs and traces in platforms like Grafana and Datadog. This position includes 24×7 support coverage (rotational) and requires strong ownership in managing major incidents, RCA processes, and continuous service improvements.

Key Responsibilities

Reliability & Incident Management
  • Serve as a key member of the 24×7 on-call rotation, responding to and managing incidents across production and pre-production environments.
  • Lead incident bridges, coordinate root cause analysis (RCA), and ensure post-incident reviews drive systemic improvements.
  • Maintain clear communication with cross-functional teams and leadership during major incidents.

Monitoring, AI-Ops, Alerts & Prevention
  • Build, tune, and maintain observability dashboards (Azure Monitor, GCP Operations Suite, Prometheus, Grafana, Datadog, Log Analytics).
  • Perform deep-dive troubleshooting of application and service-level issues using distributed tracing and log analysis (Grafana, Datadog) to pinpoint root causes beyond infrastructure.
  • Define SLOs, SLIs, and error budgets to proactively identify and mitigate reliability risks before customer impact.
  • Integrate AI-Ops tools for anomaly detection, predictive alerting, and automated incident correlation.
  • Continuously enhance alert quality, reduce false positives, and automate runbooks for faster recovery.
  • Analyze trends to prevent recurring issues and support teams in resilience engineering.

Requirements

Required Skills & Experience
  • 5+ years in Site Reliability, DevOps, Cloud Operations, or Customer support roles.
  • Demonstrated experience in application-level troubleshooting by analyzing logs and traces to identify bugs, performance bottlenecks, and error conditions.
  • Expertise in Azure and GCP cloud operations and distributed system reliability.
  • Understanding of Terraform, Ansible, and CI/CD pipelines (Jenkins, GitHub Actions).
  • Experience with observability and AI-Ops tools (Azure Monitor, GCP Operations Suite, Grafana, Prometheus, Datadog, etc.).
  • Solid grasp of incident management frameworks (P1–P3 handling, RCA, PIRs, on-call rotations).
  • Excellent analytical, troubleshooting, and communication skills.

Desired Behaviours
  • Proactive Prevention: Identifies and resolves risks before they escalate into incidents.
  • AI-Driven Mindset: Applies AI and automation to improve reliability and reduce human intervention.
  • Accountability: Owns service reliability and communicates with clarity.
  • Collaboration: Works seamlessly with platform, DevOps, and product teams.
  • Efficiency: Focuses on automation to reduce manual effort and improve MTTR.
  • Continuous Improvement: Learns from failures, iterates processes, and enhances documentation.

The pay range for this opportunity is from $129,000 to $143,000 + performance-related bonus + benefits. This range represents the anticipated low and high end of the salary for this position. This role is also eligible to receive an annual bonus that aligns with individual and company performance. Actual salaries will vary and are based on factors such as a candidate’s qualifications, skills, and competencies.

Applicants must be currently authorized to work in the United States without the need for visa sponsorship now or in the future.

Avaya is an Equal Opportunity employer and a U.S. Federal Contractor. Our commitment to equality is a core value of Avaya. All qualified applicants and employees receive equal treatment without consideration for race, religion, sex, age, sexual orientation, gender identity, national origin, disability, status as a protected veteran or any other protected characteristic. In general, positions at Avaya require the ability to communicate and use office technology effectively. Physical requirements may vary by assigned work location. This job brief/description is subject to change. Nothing in this job description restricts Avaya's right to alter the duties and responsibilities of this position at any time for any reason.

Vacancy posted 12 hours ago
Similar jobs that could be interesting for you
Based on the Site Reliability Engineer (SRE) - Azure | DevSecOps | IaC | Governance | Observability in New York, NY vacancy
  • Freelanceshop is looking for a remote SRE Observability Engineer (Datadog Specialist) to enhance our cloud-based platforms. This critical role involves designing monitoring systems to ensure reliability and performance. You will collaborate with various teams to provide... 
    Suggested
    Remote job

    Freelanceshop

    New York, NY
    3 days ago
  •  ...Posted on 11/13/2025 Title: Senior Site Reliability Engineer (SRE) Location: Remote About January At January...  ...infrastructure, design modern observability solutions, and build sustainable on-...  ...and maintain Infrastructure as Code (IaC) using Terraform or CloudFormation for... 
    Suggested
    Remote work

    Govserviceshub

    New York, NY
    22 hours ago
  • $113.9k - $189.9k

     ...The One Policy Engine is a unified policy platform...  ...(IaC) compliant from code...  ...hands-on experience with Azure, AWS and DevSecOps. The ideal candidate...  ...to deliver scalable, reliable, and high-performance...  ...processes, improve system observability, and ensure high availability... 
    Suggested
    Part time
    Internship

    LSEG (London Stock Exchange Group)

    New York, NY
    1 day ago
  • $182.3k - $220k

     ...that mission depends on reliable, secure, and scalable systems. As a Senior SRE on the infrastructure...  ...tools that empower our engineers to ship safely and confidently...  ..., performance and observability – partnering closely...  ...infrastructure workflows using IaC and other cloud native... 
    Suggested
    Local area
    Flexible hours

    Ro

    New York, NY
    a month ago
  • $211.7k - $292k

     ...that mission depends on reliable, secure, and scalable systems. As a Staff SRE on the infrastructure...  ...tools that empower our engineers to ship safely and confidently...  ..., performance and observability – partnering closely...  ...infrastructure workflows using IaC and other cloud native... 
    Suggested
    Local area
    Flexible hours

    Ro

    New York, NY
    2 days ago
  •  ...Versana is seeking a motivated SRE/DevOps Engineer with strong observability experience to join our...  .... • Improve system reliability and resiliency. •...  ...years of experience as a Site Reliability Engineer or similar...  ...with public cloud (Azure, AWS or GCP). • 3+ years... 
    Work experience placement
    Local area

    Versana

    New York, NY
    24 days ago
  • $60 - $65 per hour

     ...SRE Engineer (W2) Jersey City, NJ (Onsite) 6 Months Contract to Hire Job Description: Proficient in application development skills for...  ...third-party applications and integrations Familiarity with observability practices such as white and black box monitoring, service level... 
    Full time
    Contract work
    Work experience placement

    Pinnacle Group

    Jersey City, NJ
    22 hours ago
  • $175k - $200k

     ...Magazine. The Role As a Senior Site Reliability Engineer on the Platform team, you...  ...through automation, observability, performance tuning, and capacity...  ...infrastructure as code (IaC) practices using tools such...  ...environments Familiarity with SRE methodologies including... 
    Part time
    Work at office
    Flexible hours

    Order.co

    New York, NY
    22 hours ago
  • jobr.pro is seeking a Senior Site Reliability Engineer in New York, NY, to enhance platform reliability and engineering excellence. You will be instrumental in implementing observability, security, and CI/CD practices. This role involves coaching teams and optimizing workflows... 

    jobr.pro

    New York, NY
    22 hours ago
  • $126k - $255k

     ...within the TechOps SRE team, you'll work...  ...closely with our engineering partners to help...  ...specializing in site reliability. The Skills and...  ...as Code (IaC) standards to improve...  ...and maintaining observability (logging, monitoring...  ...Fidelity’s business is governed by the provisions... 
    Work from home

    Fidelity Investments

    Weehawken, NJ
    22 hours ago
  • $136k - $180k

    As a Staff Site Reliability Engineer, you will be a key technical leader responsible...  ...Infrastructure as Code (IaC) strategy, and ensure our...  ...the expert for scalability, observability, and building the robust, automated...  ...Experience: 8+ years in an SRE, DevOps, or Infrastructure... 
    Remote work

    Kevala Inc.

    New York, NY
    22 hours ago
  •  ...delivers secure, reliable technology...  ...standards and governance. Pay and Benefits...  ...a Lead DevOps Engineer within DTCC's...  ...reliability, observability, and secure...  ...practices in DevSecOps and cybersecurity...  ...Development (IAC) skills using...  ...experience Azure (e.g., Azure... 
    Remote work
    Flexible hours

    Dtcc

    Jersey City, NJ
    3 days ago
  • $185k - $231k

     ...operating system for governed financial intelligence...  ...for a Senior Platform Engineer to join our growing Platform...  ...Build observability and logging infrastructure...  ...engineering, DevOps, SRE, or infrastructure roles...  ...Cloud Platform (GCP) and Azure ~ Proficiency with infrastructure... 

    Monstro

    New York, NY
    13 days ago
  • $194k - $267k

     ...Identity belongs to you. We are seeking a highly technical Observability Site Reliability Engineer with a specialty in Google Cloud, to own and expand our...  ..., scalable Observability Platform that enables our SRE teams and business partners. You will treat infrastructure... 
    Permanent employment
    Full time
    Work at office
    Local area
    Flexible hours

    Okta

    New York, NY
    more than 2 months ago
  • $130k - $200k

     ..., and local government entities. The...  ...level AI/ML Engineer to deploy AI...  ...for reliability and scalability. Apply Site Reliability...  ...Engineering (SRE) principles...  ...(AWS, GCP, Azure) and MLOps workflows...  ...Kubernetes, and IaC tools (...  ...optimization, and ML observability (Prometheus,... 
    Work at office
    Local area
    Remote work

    Metropolitan Commercial Bank

    New York, NY
    4 days ago
  •  ...was a machine learning research engineer at Scale AI. The rest of our team...  ...state-of-the-art AI. As a Senior SRE, you'll tackle the scaling and reliability challenges that come with adding...  ...and building the automation and observability that keep Unify fast and reliable... 

    Unify

    New York, NY
    22 hours ago
  • Remote Sr. Azure Infrastructure Engineer This role is responsible...  ...as Code (IaC) using tools such...  ...Recovery & Azure Site Recovery (ASR):...  ...Security, Compliance & Governance: Implement...  ...management. Drive observability improvements by...  ...to secure, reliable systems... 
    Remote work

    New Era Technology

    New York, NY
    22 hours ago
  •  ...Requisition: 1429 Job Title: DevSecOps Engineer Location: Remote Clearance...  ...CD), Infrastructure as Code (IaC), and Cloud Development Environments...  ...Automated Infrastructure & Governance Utilize low‑code or...  ...Experience: 3-5 years in DevSecOps, SRE, or Cloud Engineering.... 
    Remote work
    Shift work

    Agile Defense

    New York, NY
    22 hours ago
  •  ...Observability Engineer Neuberger's Technology team is seeking...  ...Monitoring (RUM) to improve reliability, accelerate incident...  ...with application, SRE/DevOps,...  ...cost optimization, data governance, and scaling strategies...  ...with cloud platforms (Azure and AWS) and centralizing... 
    Work at office

    Neuberger Berman

    New York, NY
    3 days ago
  • $170k - $240k

    SENIOR SOFTWARE ENGINEER - OBSERVABILITY AND RELIABILITY ABOUT THE ROLE We are growing...  ...infrastructure (GCP, AWS, Azure) * Startup experience...  ...moving data or breaking governance. Sigma supports a spreadsheet...  ...job application on this site, Sigma processes your personal... 
    Full time
    Work at office
    Flexible hours

    Sigma Computing

    New York, NY
    4 days ago
  • $110k - $150k

    Agile Defense, LLC is looking for a DevSecOps Engineer to join their remote team. This role requires 3-5 years of experience in DevSecOps, focusing on building and sustaining a secure software delivery environment. The ideal candidate should have knowledge of Continuous... 
    Remote job

    Agile Defense, LLC

    New York, NY
    3 days ago
  •  ...Description Forhyre is looking for engineers who can bring unique...  ...building a culture of reliability and observability Engage in and improve the...  ...subject matter expert in an SRE mindset, best practices, and...  ...cloud infrastructure, AWS, Azure & Google Cloud Strong sense... 

    Forhyre

    New York, NY
    24 days ago
  •  ...Director, Splunk Platform Engineering & SRE At BNY, our...  ...center of enterprise observability and cybersecurity. This...  ...Drive platform reliability, capacity, observability...  ...models, ensuring strong governance and compliance Design...  ...Strong foundation in Site Reliability... 
    Work experience placement
    Worldwide

    BNY

    New York, NY
    4 days ago
  • $200k - $250k

     ...the Role We’re looking for a Staff Site Reliability Engineer to lead the evolution of Tabs’ platform...  ...operate systems that are reliable, observable, and easy to develop on. You’ll own...  ...across teams Experience ~10+ years in SRE, infrastructure, or backend... 
    Full time
    Contract work
    Work at office

    Tabs

    New York, NY
    21 days ago
  • $111k - $222k

     ...development experience for engineering and product teams by delivering reliable data platforms. You...  ...seamless data movement and governance Our Mission:...  ...as Code (IaC), and CI/CD pipelines...  ...Experience with monitoring and observability tools (Prometheus, Grafana... 
    Work at office
    Flexible hours

    DoubleVerify

    New York, NY
    4 days ago
  •  ...for Oracle Cloud along with AWS, Azure, frameworks, governance models, and cloud standards for...  ...This role may require guiding engineering teams or supplementing their...  ...Terraform, Ansible, API/CLI automation SRE and DevOps: Monitoring, Observability, SLAs/SLOs/SLIs Key Skills OCI:... 

    Neotecra

    New York, NY
    3 days ago
  • $187k - $240k

     ...for a product-minded engineer to help us quickly define...  ...emphasis on building reliable, production-quality AI...  ...in Bits AI SRE. Develop customer-facing...  ...authorizations from the US government. This job is available...  ...Datadog is the leading observability and security platform... 
    Work at office

    Datadog

    New York, NY
    2 days ago
  •  ...Senior Software Engineer/SRE - TRAX Observability TRAde Automation and eXecution (TRAX) is part of Bloomberg Enterprise Products Engineering....  ...tools and analysis required to reason about performance and reliability. We partner closely with TRAX engineering teams and our... 

    Bloomberg

    New York, NY
    3 days ago
  • $117.8k - $189k

     ...postings on employment sites will direct...  ...The Cloud DevSecOps Engineer III is responsible...  ...Operations with governance mechanisms · Help...  ...highly available, reliable, stable products...  ...similar tools for observability · Solid understanding...  ...knowledge on Azure and an understanding... 
    Temporary work
    Remote work
    Flexible hours
    Day shift

    Kapitus

    New York, NY
    24 days ago
  • $150k - $200k

    Join to apply for the Senior Site Reliability Engineer role at Gradle Inc. Develocity is a first‑of‑its‑kind toolchain observability and acceleration platform that helps software teams...  ...results. Who You Are We’re building a new SRE team and looking for founding members to... 
    Full time
    Local area
    Remote work
    Work from home

    Gradle Inc.

    New York, NY
    22 hours ago
