Lead Site Reliability Engineer
Mastercard
Role Overview
We are seeking a highly technical Lead Site Reliability Engineer (SRE) to architect, engineer, and operate highly reliable, scalable, and secure platforms across multi-cloud (AWS, Azure) and hybrid (on-prem + cloud) environments.
This is a deeply hands-on engineering role requiring expertise in distributed systems, Kubernetes, hybrid networking, automation, CI/CD, observability, and production incident leadership. The Lead SRE will serve as the technical authority for reliability across interconnected cloud and datacenter ecosystems.
• Define and implement SLIs, SLOs, and error budgets across cloud-native and on-prem workloads.
• Architect high-availability designs spanning:
o AWS and Azure regions
o On-prem datacenters
o Cross-cloud failover patterns
• Design DR strategies (RTO/RPO driven) across hybrid environments.
• Eliminate single points of failure across network, compute, storage, and DNS layers.
• Conduct resilience validation, chaos testing, and failure scenario modeling.
2. Multi-Cloud Architecture & Engineering
• Engineer and operate workloads across:
o Amazon Web Services
o Microsoft Azure
• Design cross-cloud networking (VPN, ExpressRoute, Direct Connect, Transit Gateway).
• Implement workload portability and cloud-agnostic deployment strategies.
• Optimize cost, performance, and reliability across providers.
• Design cloud-native autoscaling, load balancing, and traffic routing strategies.
3. Hybrid Infrastructure (On-Prem + Cloud Integration)
• Integrate on-prem infrastructure with cloud platforms using:
o Active Directory / IAM federation
o Hybrid DNS architecture
o Secure certificate lifecycle management
• Troubleshoot hybrid connectivity issues (BGP routing, firewall policies, NAT, MTU mismatches).
• Manage hybrid Kubernetes deployments and private registry integrations.
• Support legacy-to-cloud modernization initiatives.
4. Kubernetes & Container Platform Engineering
• Architect and operate:
o Amazon EKS
o Azure Kubernetes Service
o Self-managed Kubernetes clusters (on-prem)
• Optimize cluster autoscaling, resource allocation, and performance.
• Implement cluster security hardening and RBAC governance.
• Troubleshoot CNI, ingress controllers, service mesh, and pod networking issues.
• Implement GitOps-driven deployments.
5. Observability Engineering Across Distributed Systems
• Build unified observability across hybrid environments using:
o Splunk
o Dynatrace
o Prometheus
o Grafana
o OpenTelemetry
• Implement centralized logging across cloud and on-prem workloads.
• Design distributed tracing across multi-cloud microservices.
• Engineer proactive alerting to reduce MTTR and improve signal quality.
6. CI/CD & Infrastructure Automation
• Engineer resilient CI/CD pipelines (Jenkins, GitHub Actions, Azure DevOps).
• Implement cross-cloud infrastructure as code using:
o Terraform
o CloudFormation
• Automate:
o Certificate rotation
o Auto-scaling policies
o Patch orchestration
o Drift detection
• Improve deployment reliability via blue-green and canary strategies.
7. Advanced Production Troubleshooting
• Lead technical investigation of:
o DNS resolution failures (private/public zones, hybrid forwarding)
o TLS/PKI certificate failures
o Network latency across hybrid circuits
o Memory leaks & kernel-level issues
o Thread contention & CPU throttling
• Perform packet-level debugging (tcpdump, netstat, traceroute).
• Analyze distributed system failures spanning multiple platforms.
Technical Skills Required
• 7–10+ years in SRE / DevOps / Cloud Engineering roles.
• Deep hands-on experience in:
o AWS and Azure
o Hybrid networking
o Kubernetes (cloud & on-prem)
• Strong knowledge of:
o Linux internals
o TCP/IP, DNS, Load Balancing
o TLS/PKI and certificate lifecycle
o Distributed systems architecture
• Strong scripting/programming skills (Python preferred).
• Experience designing cross-cloud DR and failover models.
• Experience with infrastructure as code and GitOps.
Preferred Certifications
• AWS Solutions Architect (Associate/Professional)
• Azure Architect / DevOps Engineer
• Certified Kubernetes Administrator (CKA)
Work Schedule Requirement
This role supports globally distributed, business-critical systems operating 24x7.
The candidate must be willing to participate in rotational on-call shifts, including weekends and off-hours support, as part of a follow-the-sun enterprise support model.
Key Success Metrics
• Improved cross-cloud resiliency and DR posture.
• Reduced hybrid networking incidents.
• Improved SLO compliance across platforms.
• Measurable MTTR reduction.
• Increased automation coverage.
• Reduced change failure rate.
Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization. Therefore, every person working for, or on behalf of, Mastercard is responsible for information security and must:
- Abide by Mastercard’s security policies and practices;
- Ensure the confidentiality and integrity of the information being accessed;
- Report any suspected information security violation or breach; and
- Complete all periodic mandatory security trainings in accordance with Mastercard’s guidelines.
