Lead Site Reliability Engineer
Capital One
Overview
WeWork Reforma Latino (97001), Mexico, Ciudad de Mexico, Ciudad de Mexico
We're building a Site Reliability Engineering center in Mexico City, and we're hiring a Manager-level Backend Engineer to own the reliability and operational maturity of our settlement platforms. These are batch-critical systems that process every credit and debit transaction across the network.
This is a foundational role. You'll be one of the first engineers in CDMX responsible for ensuring settlement cycles complete accurately, on time, and in compliance with SOX and PCI-DSS requirements. You'll work across hybrid infrastructure (on-prem data centers and AWS), partner closely with UK-based engineers, and build the automation and observability that allow Mexico City to operate settlement.
What You'll Do
Own reliability for batch settlement systems - ensure cycle completion windows are met, data integrity is maintained, and failures are detected before they reach downstream consumers
Build and improve observability for settlement pipelines - dashboards, alerts, and anomaly detection that make system health legible and reduce reliance on tribal knowledge
Drive automation of operational toil - certificate rotation, environment provisioning, compliance artifact generation, and manual validation steps that currently require human intervention
Partner with UK-based settlement engineers - acquire domain expertise on Durbin compliance windows, cross-border DCI routing, and acquirer/issuer SLA adherence
Participate in incident management - respond to settlement failures, drive root cause analysis, and implement durable fixes that prevent recurrence
Contribute to regulatory readiness - ensure SRE practices produce audit-ready artifacts for SOX and PCI-DSS exams without manual toil
What Success Looks Like
Independently validate and troubleshoot settlement cycle failures
At least two manual settlement operations processes fully automated
Settlement observability coverage sufficient to detect anomalies before cycle deadlines
Documented runbooks and severity criteria for all critical settlement failure modes
The Environment
You'll work with batch processing systems that handle financial transactions across multiple on-prem data centers with active/active and active/passive configurations. The stack includes Java, Python, shell scripting, SQL, AWS, Kubernetes, OpenShift containers, Datadog, Observe, and legacy payment platforms. CI/CD pipelines, API automation, and secret management via HashiCorp Vault are part of daily operations. You'll leverage agentic AI automation (Claude Code or others) to accelerate development and build automation solutions. You'll need strong troubleshooting and debugging skills and be comfortable with both modern cloud-native tooling and traditional enterprise batch systems.
Basic Qualifications
- Professional English fluency
- Bachelor's degree
- At least 6 years of experience in SRE, production operations, or reliability engineering
- Experience in DevOps Engineering (internship experience does not apply)
- 5+ years of experience in at least one of the following: Java, Python, Go
- At least 4 years of experience with Cloud Native technologies (Amazon Web Services, Microsoft Azure, Google Cloud Platform)
- 3+ years of experience with container orchestration services including Docker or Kubernetes
- Experience with Shell or Bash scripting
- At least 3 years of Unix or Linux system administration experience
Preferred Qualifications
- Experience developing automation solutions using agentic AI tools (Claude Code, Copilot CLI)
- Troubleshooting and debugging skills across distributed systems
- Familiarity with payments, financial services, or other regulated high-availability domains
- Knowledge of or experience with networking concepts (TCP, DNS, TLS)
For technical support or questions about Capital One's recruiting process, please send an email to the address listed on capitalonecareers.com
Capital One does not provide, endorse nor guarantee and is not liable for third-party products, services, educational tools or other information available through this site.
Capital One Financial is made up of several different entities. Please note that any position posted in Canada is for Capital One Canada, any position posted in the United Kingdom is for Capital One Europe, any position posted in the Philippines is for Capital One Service Corp (COPSSC), and any position posted in Mexico is for Capital One Technology Labs Mexico.
