Senior Site Reliability Engineer

LeanData

LeanData helps the world's fastest-growing companies automate, simplify, and accelerate revenue.

We are looking for a Senior Site Reliability Engineer to lead the strategic evolution of our cloud infrastructure. Reporting directly to the SVP of Engineering, this role is designed for a builder: someone who wants to move beyond maintenance and into architectural transformation.

You will have the autonomy to evaluate our existing AWS footprint and lead the charge in modernizing our environment. Your mission is to take a high-velocity system and implement the best practices, guardrails, and automated architectures that will support our next 10x of scale. You will be the primary authority on reliability, performance, and infrastructure security.

This is a hybrid role based in our Santa Clara, CA office, with an in-office schedule of two days per week – Monday and Wednesday.

Key Responsibilities

Architectural Modernization: Lead the design and implementation of a scalable, "Cloud-First" AWS architecture. You will drive the transition toward fully automated, state-of-the-art Infrastructure as Code (Terraform).
High Availability & Resilience: Design and implement robust Disaster Recovery (DR) and Business Continuity plans, moving our services toward a zero-downtime deployment model.
Performance & Capacity Engineering: Own the strategy for capacity planning and autoscaling. You will optimize our compute resources (EC2, Lambda) to handle bursty traffic patterns with precision and cost-efficiency.
Advanced Observability: Define our monitoring and alerting philosophy using New Relic for deep APM and system insights. Pair this with IncidentIO to ensure we catch and resolve issues before they impact customers.
Streamlined CI/CD: Partner with feature teams to refine Change Management and CI/CD pipelines, ensuring code moves from "commit" to "production" safely and predictably.
Cloud Security: Harden our network architecture and application security posture, including WAF management and secure service-to-service communication.

The Tech Stack

Cloud Infrastructure: AWS (EC2, Lambda, SQS, SNS, ALB, API Gateway, S3, WAF).
Observability & Incident Response: New Relic (APM/Infrastructure), IncidentIO.
Automation & Tools: Terraform, Redis/ElastiCache, Shell Scripting, NPM/PM2.
Application Ecosystem: Node.js, Python, C#, Angular, Apex.
Integration: Salesforce Managed Packages, Microsoft Dynamics 365.

Who You Are

Experienced Architect: 5+ years of experience in SRE, DevOps, or Systems Engineering, with a proven track record of managing complex AWS environments.
Proven Incident Commander: You demonstrate calm, decisive leadership during high-pressure outages. You have extensive experience running blameless postmortems and, crucially, driving the remediation work needed to prevent recurrence.
Observability Pro: You have deep experience configuring New Relic (or similar platforms) to create meaningful dashboards, SLIs, and SLOs.
Automation Advocate: You believe that manual intervention is a bug. You have deep experience with Terraform and a "Code-First" approach to infrastructure.
Strategic Problem Solver: You can look at a complex, "needs-based" architecture and formulate a clear, prioritized roadmap to move it toward industry best practices.
Collaborative Leader: You enjoy working with feature engineers to help them build "reliability-by-design" into their services.
Education: A Bachelor's degree in Computer Science, Engineering, or a related technical field (or equivalent professional experience).

Why work at LeanData:

LeanData covers up to 90% of employee insurance premiums
Stock options in LeanData for all full-time employees
Flexible PTO
401K plan

Vacancy posted 2 days ago
