Site Reliability Engineering Manager
Mastercard
The Xborder team is looking for a Site Reliability Engineering Manager who can help us solve problems, implement automation, and leverage best practices.
· Are you a born problem solver who loves to figure out how something works?
· Are you a detail-oriented individual who enjoys complex problem solving?
· Do you love determining the correct actions required to fix a problem?
· Do you have a low tolerance for manual work and look to automate everything you can?
• Lead and drive the end-to-end service lifecycle, ensuring teams effectively engage from inception and design through deployment, operations, and continuous improvement, while aligning with business objectives.
• Oversee ITSM practices across the platform, establishing governance and ensuring teams proactively identify operational gaps and resiliency risks, while driving action plans in partnership with engineering teams.
• Provide strategic direction for production readiness, guiding teams on system design consulting, capacity planning, and launch readiness reviews to ensure scalable and reliable service delivery.
• Own service reliability outcomes by defining KPIs/SLOs and leading the monitoring of availability, latency, and system health, ensuring accountability across the team.
• Drive scalability and operational efficiency, promoting automation, standardization, and continuous improvement initiatives to enhance reliability, reduce toil, and accelerate delivery velocity.
• Lead incident management excellence, establishing best practices for sustainable incident response, ensuring blameless postmortems, and driving root cause remediation and preventive actions at scale.
• Champion a holistic, cross-stack problem-solving approach, enabling teams to effectively manage complex production incidents and improve mean time to recovery (MTTR).
• Manage and develop a high-performing global team, fostering collaboration across geographies and time zones while ensuring alignment, engagement, and productivity.
• Build and nurture talent through coaching, mentoring, and career development, while promoting a strong culture of knowledge sharing and continuous learning.
All about you
• Bachelor’s degree in Computer Science, Information Technology, or a related technical field (e.g., Engineering, Physics, Mathematics), or equivalent practical experience. Experience in financial services is preferred.
• 8–15 years of relevant experience in Site Reliability Engineering, Infrastructure, or DevOps roles, with a combination of hands-on technical expertise and early leadership responsibilities.
• Strong technical foundation across enterprise platforms, Linux/UNIX systems, operating systems, and database environments (Oracle/SQL, DBA), with the ability to provide technical guidance and support to the team.
• Experience with observability and monitoring tools (e.g., Splunk, Dynatrace), driving improved system visibility, performance, and reliability.
• Solid experience in DevOps and CI/CD practices, with the ability to support and guide automation, deployment pipelines, and operational improvements.
• Proficiency in one or more programming or scripting languages such as Python, Java, Go, C/C++, Perl, or Ruby, with practical application in automation or system improvements.
• Proven exposure to automation initiatives, with the ability to contribute to and help scale solutions that reduce operational toil and improve efficiency.
• Working knowledge of ITSM processes, including incident, problem, and change management, with experience applying these practices in production environments.
• Experience supporting customer-facing platforms, ensuring service reliability, availability, and effective issue resolution.
• Strong analytical and problem-solving skills, with the ability to troubleshoot complex issues and support the team during high-severity incidents.
• Ability to prioritize, organize, and manage multiple workstreams, balancing operational needs with ongoing improvements.
• Effective communication and collaboration skills, with experience working across engineering, product, and operations teams in a global environment.
• Demonstrated experience mentoring and supporting junior engineers, contributing to team development and knowledge sharing (formal people management experience is a plus but not mandatory).
• Understanding of large-scale distributed systems, including basic design principles, performance considerations, and troubleshooting approaches.
• Exposure to Artificial Intelligence use cases and implementation is a plus, particularly in relation to automation, observability, or operational insights.
Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization and, therefore, it is expected that every person working for, or on behalf of, Mastercard is responsible for information security and must:
- Abide by Mastercard’s security policies and practices;
- Ensure the confidentiality and integrity of the information being accessed;
- Report any suspected information security violation or breach; and
- Complete all periodic mandatory security trainings in accordance with Mastercard’s guidelines.
