Senior Site Reliability Engineer
Las Vegas Sands Corp
The primary responsibility of the Senior Site Reliability Engineer (SRE) is to lead reliability engineering initiatives across our Azure estate and Command Center operations. This role focuses on scripting, automation, and observability to ensure uptime, performance, and rapid incident response. The Senior SRE will design and implement monitoring-as-code, optimize alerting, and build self-healing automation that reduces toil and accelerates recovery.
As part of our journey from traditional operations toward a mature SRE model, the Senior SRE will partner with product engineering, platform teams, and the Command Center, including Service Desk and Major Incident Command (MIC), to deliver measurable improvements in service reliability.
All duties are to be performed in accordance with departmental and Las Vegas Sands Corp.'s policies, practices, and procedures. All Las Vegas Sands Corp. Team Members are expected to conduct and carry themselves in a professional manner at all times. Team Members are required to observe the company's standards, work requirements and rules of conduct.
Essential Duties & Responsibilities
Observability & Monitoring
Architect end-to-end monitoring using Azure Monitor, Log Analytics, Application Insights, and ITRS Geneos.
Implement monitoring-as-code with Terraform/Bicep, including alerts, dashboards, and diagnostic settings.
Create actionable dashboards (Azure Workbooks, Grafana) for SLIs/SLOs and real-time service health.
Alerting & Incident Response
Design alert taxonomies with severity mapping (P0–P4), dynamic thresholds, and escalation policies.
Reduce alert noise and ensure 100% alert-to-runbook mapping.
Support Major Incident Command (MIC) during P0/P1 bridges with technical expertise and rapid remediation.
Automation & Tooling
Build automation using PowerShell, Python, and Azure Functions for alert lifecycle, runbooks, and self-healing workflows.
Integrate with ITSM (ServiceNow/Jira) for automated ticket enrichment and routing.
Eliminate repetitive operational tasks and reduce toil through automation-first practices.
Reliability Engineering
Define and enforce SLIs/SLOs, error budgets, and resilience patterns (bulkheads, retries, timeouts).
Conduct production readiness reviews, chaos drills, and failover rehearsals.
Partner with app teams to embed instrumentation and structured logging.
Governance & Compliance
Enforce desired state with Azure Policy, DSC/Guest Configuration, and drift detection.
Harden networking (VNet, NSGs, Private Link, Firewall), identity (Entra ID), and secrets (Key Vault).
Ensure auditability and compliance across environments.
Perform job duties in a safe manner.
Attend work as scheduled on a consistent and regular basis.
Perform other related duties as assigned.
Minimum Qualifications
At least 21 years of age.
Proof of authorization to work in the United States.
Bachelor's degree in Computer Science or an IT field, or equivalent experience.
Must be able to obtain and maintain any certification or license, as required by law or policy.
7+ years of experience in SRE/DevOps/Platform roles, with 4+ years focused on Azure in production at scale.
Expert knowledge in Infrastructure as Code (Terraform or Bicep) and Git-based workflows (GitHub Actions/Azure DevOps).
Proficiency in CI/CD, deployment strategies (canary, blue-green), and automated rollbacks.
Proficiency in PowerShell and Python for automation; experience building reusable modules.
Demonstrated experience with AKS, App Services, Functions, VM Scale Sets, and Azure networking/security.
Deep knowledge of:
Azure: AKS, App Services, Functions, VMSS, Storage, Front Door, API Management, Load Balancers, Monitor, Log Analytics, App Insights, Key Vault, Policy, Defender
Automation & IaC: Terraform/Bicep, PowerShell, Python, GitHub Actions/Azure DevOps
Observability: Azure Monitor, Log Analytics, App Insights, Prometheus/OpenTelemetry; experience with ITRS Geneos.
Service Management: ServiceNow, Jira
Proficiency in SRE fundamentals: SLIs/SLOs, error budgets, capacity planning, chaos testing, and toil reduction.
Demonstrated experience leading incidents and collaborating across teams.
Strong interpersonal skills with the ability to communicate effectively and interact appropriately with management, other Team Members and outside contacts of different backgrounds and levels of experience.
Must be available to work varied shifts including nights, weekends, and holidays, to ensure 24/7 coverage.
Provide off-hours support on an infrequent but as-needed basis during critical incidents. (Potential shifts may run 24/7 due to the needs of the business.)
Team Members are required to be on site within the IT Command Center.
Preferred Qualifications
Certifications & Training
AZ-400: Azure DevOps Engineer Expert
AZ-305: Azure Solutions Architect Expert or AZ-104: Azure Administrator Associate
AZ-500: Azure Security Engineer Associate
ITIL v4 for operational rigor
SRE Foundation/Practitioner Certification (DevOps Institute or equivalent)
Physical Requirements
Must be able to:
Lift or carry 50 pounds, unassisted, in the performance of specific tasks, as assigned.
Physically access assigned workspace areas with or without reasonable accommodation.
Work indoors and be exposed to various environmental factors such as, but not limited to, CRT, noise, and dust.
Utilize laptop and standard keyboard to perform essential functions of the job.

