Senior Site Reliability Engineer / DevOps Engineer

Prophet Town

Mountain View, United States | Posted on 05/12/2026

Location: Onsite - Mountain View, CA
Experience Required: 5+ years
Infrastructure Footprint: Global production infrastructure across AWS, South America, and Europe

Role Overview

Seeking a Senior Site Reliability Engineer / DevOps Engineer to design, scale, and operate highly available global infrastructure supporting production systems across multiple international regions.

This role is for an engineer with 5+ years of experience building and running production‑grade cloud infrastructure. The right person understands where distributed systems fail and has learned the hard lessons that come from operating Kubernetes and cloud platforms at scale.

The ideal candidate has deep hands‑on experience with Kubernetes, ArgoCD, Terraform, CI/CD pipelines, AWS infrastructure, and multi‑region platform reliability. They should understand the limitations, sharp edges, and operational failure modes of these tools.

This is an onsite role working closely with platform engineering and leadership to build resilient global infrastructure.

What You’ll Do

Design and operate globally distributed production infrastructure across AWS regions and physical data center environments in South America and Europe
Build highly available multi-region systems with strong disaster recovery and failover strategies
Solve cross-region networking, latency, DNS routing, replication, and reliability challenges

Build, scale, secure, and troubleshoot production Kubernetes clusters
Handle cluster lifecycle management, upgrades, node failures, networking issues, storage problems, and control‑plane troubleshooting
Tune workloads for resiliency, scheduling efficiency, autoscaling behavior, and resource optimization
  etcd instability
  networking overlays and CNI failures
  node pressure and eviction behavior
  cluster upgrade regressions

GitOps / ArgoCD Operations

Design and maintain GitOps workflows using ArgoCD
Manage promotion pipelines across environments and regions
Resolve drift detection issues, sync conflicts, reconciliation failures, and deployment ordering challenges
Build safe rollback and progressive deployment strategies
Candidates should know why ArgoCD breaks, not just how to click “Sync.”

Infrastructure as Code

Build and maintain reusable Terraform modules for multi‑region infrastructure
Manage state strategy, workspace isolation, secrets handling, and provider complexity
Solve real‑world Terraform pain points, including:
  state corruption and locking conflicts
  module version drift
  provider upgrade regressions
  dependency graph surprises
  cross‑account provisioning complexity

Build and optimize production CI/CD pipelines
Improve deployment speed, safety, and repeatability
Troubleshoot flaky pipelines, artifact inconsistencies, race conditions, environment drift, and rollback failures

Reliability & Observability

Establish SLIs/SLOs and production health standards
Build alerting, monitoring, tracing, and incident response workflows
Lead root cause analysis and postmortem improvements
Reduce operational toil through automation

Why This Role

You’ll own foundational infrastructure decisions for globally distributed systems and help build resilient platform capabilities at international scale. This is a hands‑on engineering role for someone who wants meaningful ownership and complex technical problems.

Requirements

Required Experience

5+ years in Site Reliability Engineering, DevOps, or Platform Engineering
Deep production experience with:
  ArgoCD
  Terraform
  AWS
  CI/CD systems

Preferred Experience

Experience operating infrastructure across multiple continents
Experience with hybrid cloud or physical data center integration
Strong networking knowledge, including BGP, VPNs, routing, DNS, and load balancing
Experience with security hardening and compliance in production systems
Software engineering background with Go, Python, or Bash

What “Senior” Means Here

You have enough production experience to have strong opinions because you have seen failures firsthand. You know:
  why Terraform plans sometimes lie
  why ArgoCD syncs can fail for non‑obvious reasons
  why Kubernetes upgrades can ruin your week
  why “works in staging” means very little
  why multi‑region failover diagrams often fail in production
  why observability usually breaks exactly when needed most

You’ve solved these problems repeatedly and improved systems because of those lessons.

Vacancy posted 4 days ago
