Senior Site Reliability Engineer / DevOps Engineer

Prophet Town

Mountain View, United States | Posted on 05/12/2026

Location: Onsite - Mountain View, CA
Experience Required: 5+ years
Infrastructure Footprint: Global production infrastructure across AWS, South America, and Europe

Role Overview

Seeking a Senior Site Reliability Engineer / DevOps Engineer to design, scale, and operate highly available global infrastructure supporting production systems across multiple international regions.

This role is for an engineer with 5+ years of experience building and running production‑grade cloud infrastructure. The right person understands where distributed systems fail and has learned the hard lessons that come from operating Kubernetes and cloud platforms at scale.

The ideal candidate has deep hands‑on experience with Kubernetes, ArgoCD, Terraform, CI/CD pipelines, AWS infrastructure, and multi‑region platform reliability. They should understand the limitations, sharp edges, and operational failure modes of these tools.

This is an onsite role working closely with platform engineering and leadership to build resilient global infrastructure.

What You’ll Do

  • Design and operate globally distributed production infrastructure across AWS regions and physical data center environments in South America and Europe
  • Build highly available multi-region systems with strong disaster recovery and failover strategies
  • Solve cross-region networking, latency, DNS routing, replication, and reliability challenges
  • Build, scale, secure, and troubleshoot production Kubernetes clusters
  • Handle cluster lifecycle management, upgrades, node failures, networking issues, storage problems, and control‑plane troubleshooting
  • Tune workloads for resiliency, scheduling efficiency, autoscaling behavior, and resource optimization
  • etcd instability
  • networking overlays and CNI failures
  • node pressure and eviction behavior
  • cluster upgrade regressions

GitOps / ArgoCD Operations

  • Design and maintain GitOps workflows using ArgoCD
  • Manage promotion pipelines across environments and regions
  • Resolve drift detection issues, sync conflicts, reconciliation failures, and deployment ordering challenges
  • Build safe rollback and progressive deployment strategies

Candidates should know why ArgoCD breaks, not just how to click “Sync.”

Infrastructure as Code

  • Build and maintain reusable Terraform modules for multi‑region infrastructure
  • Manage state strategy, workspace isolation, secrets handling, and provider complexity
  • Solve real‑world Terraform pain points, including:
      • state corruption and locking conflicts
      • module version drift
      • provider upgrade regressions
      • dependency graph surprises
      • cross‑account provisioning complexity

  • Build and optimize production CI/CD pipelines
  • Improve deployment speed, safety, and repeatability
  • Troubleshoot flaky pipelines, artifact inconsistencies, race conditions, environment drift, and rollback failures

Reliability & Observability

  • Establish SLIs/SLOs and production health standards
  • Build alerting, monitoring, tracing, and incident response workflows
  • Lead root cause analysis and postmortem improvements
  • Reduce operational toil through automation

Why This Role

You’ll own foundational infrastructure decisions for globally distributed systems and help build resilient platform capabilities at international scale. This is a hands‑on engineering role for someone who wants meaningful ownership and complex technical problems.

Requirements

Required Experience

  • 5+ years in Site Reliability Engineering, DevOps, or Platform Engineering
  • Deep production experience with:
      • ArgoCD
      • Terraform
      • AWS
      • CI/CD systems

Preferred Experience

  • Experience operating infrastructure across multiple continents
  • Experience with hybrid cloud or physical data center integration
  • Strong networking knowledge, including BGP, VPNs, routing, DNS, and load balancing
  • Experience with security hardening and compliance in production systems
  • Software engineering background with Go, Python, or Bash

What “Senior” Means Here

You have enough production experience to have strong opinions because you have seen failures firsthand. You know:

  • why Terraform plans sometimes lie
  • why ArgoCD syncs can fail for non‑obvious reasons
  • why Kubernetes upgrades can ruin your week
  • why “works in staging” means very little
  • why multi‑region failover diagrams often fail in production
  • why observability usually breaks exactly when needed most

You’ve solved these problems repeatedly and improved systems because of those lessons.

Vacancy posted 4 days ago