Manager, Site Reliability Engineer
Mastercard
Who is Mastercard?
At Mastercard technology, we work to connect and power an inclusive, digital economy that benefits everyone, everywhere, by making transactions safe, simple, smart, and accessible. Using secure data and networks, partnerships, and passion, our innovations and solutions help individuals, financial institutions, governments, and businesses realize their greatest potential. Our decency quotient, or DQ, drives our culture and everything we do inside and outside of our company. We cultivate a culture of inclusion for all employees that respects their individual strengths, views, and experiences. We believe that our differences enable us to be a better team – one that makes better decisions, drives innovation, and delivers better business results.
What we create today will define tomorrow. Revolutionary technologies that reshape the digital economy to be more connected and inclusive than ever before. Safer, faster, more sustainable.
And we need the best people to do it. Technologists who are energized by the challenges of a truly global network. With the talent and vision to create the critical systems and products that power global commerce and connect people everywhere to the vital goods and services they need every day.
Working at Mastercard means being part of a unique culture. Inclusive and diverse, a rich collaboration of ideas and perspectives. A place that celebrates your strengths, values your experiences, and offers you the flexibility to shape a career across disciplines and continents. And the opportunity to work alongside experts and leaders at every level of the business, improving what exists, and inventing what’s next.
About the Role
The Business Operations team is seeking a highly motivated and experienced Manager, Site Reliability Engineer (SRE) to join our team. You will play a critical role in ensuring the reliability, scalability, and performance of our applications, supporting essential services that power Mastercard's global operations. As a thought leader in your field, you will bring technical expertise, a passion for automation, and the ability to mentor.
The role of the Business Operations Site Reliability Engineer is to be the production readiness steward for Mastercard products. As Business Operations SRE, we are responsible for ensuring that our platform is stable and healthy. We break down barriers to running our products by fostering developer-run ownership and empowering developers to build resilient products. We support our developers during the application build phase in software run principles that include operational design, automation, capacity planning, and monitoring, leading to fault-tolerant, scalable products. We see the big picture and help create and enforce operations standards while facilitating an agile and learning culture. We support daily operations with a sharp focus on triage and root cause analysis, understanding the business impact of our products and then performing blameless post-mortems.
The goal of every Business Operations team is to engage early in the development lifecycle so as to be more proactive and upfront in the development process, and to proactively manage production and change activities to maximize customer experience and increase the overall value of supported applications. Business Operations teams also focus on risk management by tying all our activities together with an overarching responsibility for compliance and risk mitigation across all our environments. Ultimately, the role of Business Operations is to align Product and Customer Focused priorities with Operational needs by providing continuous feedback throughout the lifecycle.
As part of the Business Operations team, you will:
• Oversee a team of individual contributors, supporting the execution of strategic initiatives by providing technical expertise and leadership within the Site Reliability Engineering discipline to analyze complex problems and provide novel solutions and/or improvements.
• Guide the team in automating routine tasks, troubleshooting complex issues, and optimizing system performance.
• Collaborate with cross-functional teams to develop strategies for system scalability and resilience, training team members on technical skills, operational best practices, and incident management.
• Oversee incident response efforts, ensuring timely resolution and comprehensive root cause analysis.
• Cultivate a culture of continuous improvement by promoting best practices, innovation, and proactive risk management.
• Support the implementation and maintenance of high-availability systems to ensure operational stability.
• Contribute to documentation, knowledge sharing, and best practices to improve team operational procedures.
• Lead automation and scripting efforts to streamline operational processes and incident response workflows.
• Manage a team of individual contributor(s) and/or technical lead(s), directing area processes and work to ensure that they align with functional best practices and organizational standards; conduct goal setting and performance appraisal processes to coach team members and support their professional development.
Role qualifications:
The ideal candidate will apply leadership skills independently and consistently in complex or nuanced situations to support broader goals, and will be recognized as a key contributor who may coach or support others informally.
As a leader, you will:
• Build diverse, high performing teams with a customer-focused mindset. Attract, grow, and develop exceptional, future-ready talent.
• Inspire teams to look beyond their function, connect their work to enterprise impact, think end to end, and act in the best interests of the whole company.
• Anticipate market shifts and use curiosity, innovation, and technology to turn insights into strategies that drive growth and competitive advantage.
• Lead through ambiguity across diverse markets and regulatory environments, connecting insights and stakeholders to create clarity with sound judgment and cross cultural awareness.
• Inspire and mobilize people and teams to act with speed, agility, and accountability in driving ambitious business outcomes, with a relentless focus on the customer.
• Explore new ideas, ways of working, and technology. Set clear direction, align stakeholders, and remove barriers to progress. Guide teams through uncertainty with clarity, empathy, and resilience.
As this is a player/coach role, the ideal candidate will also apply the following skills independently and consistently in complex or nuanced situations, begin using the skills to support broader goals, and be recognized as a key contributor who may coach or support others informally.
• Observability - Ability to use scripting and tooling to implement observability solutions, enabling the collection, analysis, and visualization of metrics, logs, and traces to support incident detection, diagnosis, and continuous service improvement.
• Programming and Scripting - Ability to write and maintain code and scripts to automate tasks, build operational tools, and support monitoring, deployment, and incident response using languages such as Python, Go, Bash, or similar.
• Systems and Network Administration - Ability to configure, operate, and troubleshoot Linux/Unix systems and network components, applying knowledge of networking concepts, protocols, security, and system reliability.
• Cloud Computing and Infrastructure - Ability to design, deploy, and manage applications and infrastructure on cloud platforms (e.g., AWS, Azure, GCP), ensuring scalability, security, availability, and operational efficiency.
• Reliability and Scalability - Ability to design and operate systems for high availability, fault tolerance, and disaster recovery, while ensuring systems can scale to meet current and future demand.
• DevOps Practices - Ability to apply DevOps principles and practices, including CI/CD pipelines, containerization, and orchestration, to enable faster, more reliable software delivery and operations.
• Troubleshooting - Capability to systematically identify, diagnose, and resolve technical issues across systems, applications, and networks, using analytical methods and tools to restore functionality, minimize disruption, and ensure stable operations.
• Capacity Planning and Performance Optimization - Ability to monitor resource utilization, forecast future capacity needs, and optimize system performance to support growth, scalability, and efficient infrastructure usage.
• IT Service Management - Ability to apply IT service management principles to incident, problem, and change management, ensuring reliable service delivery, effective incident response, and continuous service improvement aligned to business needs.
• Proactive Monitoring and Improvement (SRE Applications) - Ability to use application reliability signals to anticipate issues, identify risks, and drive preventative improvements that enhance application performance and availability.
Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization and, therefore, it is expected that every person working for, or on behalf of, Mastercard is responsible for information security and must:
- Abide by Mastercard’s security policies and practices;
- Ensure the confidentiality and integrity of the information being accessed;
- Report any suspected information security violation or breach; and
- Complete all periodic mandatory security trainings in accordance with Mastercard’s guidelines.
