
Senior Site Reliability Engineer / DevOps Engineer

Prophet Town

Mountain View, United States | Posted on 05/12/2026

Location: Onsite - Mountain View, CA
Experience Required: 5+ years
Infrastructure Footprint: Global production infrastructure across AWS, South America, and Europe

Role Overview

Seeking a Senior Site Reliability Engineer / DevOps Engineer to design, scale, and operate highly available global infrastructure supporting production systems across multiple international regions.

This role is for an engineer with 5+ years of experience building and running production‑grade cloud infrastructure. The right person understands where distributed systems fail and has learned the hard lessons that come from operating Kubernetes and cloud platforms at scale.

The ideal candidate has deep hands‑on experience with Kubernetes, ArgoCD, Terraform, CI/CD pipelines, AWS infrastructure, and multi‑region platform reliability. They should understand the limitations, sharp edges, and operational failure modes of these tools.

This is an onsite role working closely with platform engineering and leadership to build resilient global infrastructure.

What You’ll Do

  • Design and operate globally distributed production infrastructure across AWS regions and physical data center environments in South America and Europe
  • Build highly available multi-region systems with strong disaster recovery and failover strategies
  • Solve cross-region networking, latency, DNS routing, replication, and reliability challenges

  • Build, scale, secure, and troubleshoot production Kubernetes clusters
  • Handle cluster lifecycle management, upgrades, node failures, networking issues, storage problems, and control‑plane troubleshooting
  • Tune workloads for resiliency, scheduling efficiency, autoscaling behavior, and resource optimization

  • etcd instability
  • networking overlays and CNI failures
  • node pressure and eviction behavior
  • cluster upgrade regressions

GitOps / ArgoCD Operations

  • Design and maintain GitOps workflows using ArgoCD
  • Manage promotion pipelines across environments and regions
  • Resolve drift detection issues, sync conflicts, reconciliation failures, and deployment ordering challenges
  • Build safe rollback and progressive deployment strategies

Candidates should know why ArgoCD breaks, not just how to click “Sync.”

Infrastructure as Code

  • Build and maintain reusable Terraform modules for multi‑region infrastructure
  • Manage state strategy, workspace isolation, secrets handling, and provider complexity
  • Solve real‑world Terraform pain points, including:
      • state corruption and locking conflicts
      • module version drift
      • provider upgrade regressions
      • dependency graph surprises
      • cross‑account provisioning complexity

  • Build and optimize production CI/CD pipelines
  • Improve deployment speed, safety, and repeatability
  • Troubleshoot flaky pipelines, artifact inconsistencies, race conditions, environment drift, and rollback failures

Reliability & Observability

  • Establish SLIs/SLOs and production health standards
  • Build alerting, monitoring, tracing, and incident response workflows
  • Lead root cause analysis and postmortem improvements
  • Reduce operational toil through automation

Why This Role

You’ll own foundational infrastructure decisions for globally distributed systems and help build resilient platform capabilities at international scale. This is a hands‑on engineering role for someone who wants meaningful ownership and complex technical problems.

Requirements

Required Experience

  • 5+ years in Site Reliability Engineering, DevOps, or Platform Engineering
  • Deep production experience with:
      • ArgoCD
      • Terraform
      • AWS
      • CI/CD systems

Preferred Experience

  • Experience operating infrastructure across multiple continents
  • Experience with hybrid cloud or physical data center integration
  • Strong networking knowledge, including BGP, VPNs, routing, DNS, and load balancing
  • Experience with security hardening and compliance in production systems
  • Software engineering background with Go, Python, or Bash

What “Senior” Means Here

You have enough production experience to have strong opinions because you have seen failures firsthand. You know:

  • why Terraform plans sometimes lie
  • why ArgoCD syncs can fail for non‑obvious reasons
  • why Kubernetes upgrades can ruin your week
  • why “works in staging” means very little
  • why multi‑region failover diagrams often fail in production
  • why observability usually breaks exactly when needed most

You’ve solved these problems repeatedly and improved systems because of those lessons.

Vacancy posted 3 days ago
Similar jobs that could be interesting for you
Based on the Senior Site Reliability Engineer / DevOps Engineer in Mountain View, CA vacancy
  • $210k - $270k

    Your Impact on our Mission: Zocdoc is looking for a Senior Site Reliability Engineer to help develop, monitor, and maintain our distributed production...  ...improve future uptime Enforce a culture around strong DevOps and where product teams share a big role in site... 
    Senior
    Flexible hours

    GoTo Meeting

    Palo Alto, CA
    1 day ago
  • A leading tech recruiting firm is seeking a Site Reliability Engineer to manage and optimize cloud infrastructure primarily using GCP or AWS. The...  ...candidates will have over 4 years of experience in SRE or DevOps and a strong understanding of security best practices. This... 
    Senior

    Amiri Recruiting

    Mountain View, CA
    1 day ago
  • $210k - $270k

    Zocdoc is seeking a Senior Site Reliability Engineer to develop and maintain distributed production systems. The ideal candidate will have over 5 years of experience in site reliability or production engineering, particularly in cloud environments like AWS. Responsibilities... 
    Senior

    GoTo Meeting

    Palo Alto, CA
    1 day ago
  •  ..., and the challenges of building in a high-growth startup, we’d love to talk. This is more than a job—it’s a journey. Site Reliability Engineers (SREs) are responsible for the overall performance and reliability of ASAPP's infrastructure and products. The team owns... 
    Senior
    Remote work

    ASAPP

    Mountain View, CA
    1 day ago
  •  ...join our small team focused on growth and productivity. The role involves scaling our platform and infrastructure while enhancing reliability and the overall developer experience. Ideal candidates will have strong expertise in distributed systems, cloud-native... 
    Senior
    Remote job

    BuildBuddy

    Palo Alto, CA
    1 day ago
  • $180k - $260k

     ...operations. About the role We are seeking an experienced Senior/Staff Site Reliability Engineer to support the operation, monitoring, and scaling of our...  ...in a related role such as Site Reliability Engineer, DevOps Engineer, or Infrastructure Engineer. Strong knowledge... 
    Senior
    Odd job
    Work at office
    Remote work

    Booster

    Mountain View, CA
    1 day ago
  • The Role We're looking for a Senior Site Reliability Engineer to own the reliability, scalability, and operational excellence of the production systems that power Nectar's platform. We run high-volume data ingestion pipelines and real-time AI agents on top of a fast-growing... 
    Senior

    Nectar

    Palo Alto, CA
    1 day ago
  • $140k - $220k

    About the Job You’ll own reliability and operational excellence for Pylon’s production systems. This means designing and implementing...  ...scale as we grow. You’ll build tooling that makes the entire engineering team more effective, establish on‑call rotations and runbooks... 
    Senior

    Pylon

    Palo Alto, CA
    3 days ago
  • $174k - $252k

    Senior Software Engineer, Site Reliability Engineering X Applicants in San Francisco: Qualified applications with arrest or conviction records will be considered for employment in accordance with the San Francisco Fair Chance Ordinance for Employers and the California... 
    Senior
    Full time

    Google Inc.

    Sunnyvale, CA
    1 day ago
  • Zocdoc, located in Silicon Valley, CA, is seeking a Senior Site Reliability Engineer to monitor and maintain cloud-based systems ensuring uptime for millions of patients. You'll work with cutting-edge technology in a diverse and collaborative environment. This role requires... 
    Senior

    Dormont Manufacturing Co

    Palo Alto, CA
    4 days ago
  • $145k - $165k

    A technology solutions firm in Sunnyvale, CA is looking for a highly experienced Site Reliability Engineer (SRE). This role involves maintaining uptime and performance across systems. Exceptional Linux expertise and automation skills in Bash and Python are crucial. Key... 
    Senior

    Bolt Graphics, Inc.

    Sunnyvale, CA
    3 days ago
  • $179.2k - $268.8k

     ...sensors and compute systems, test operations, systems and safety engineering - all dedicated to making a real, positive impact on the...  ..., Mich., and Palo Alto, Calif. Meet the team: As a Site Reliability Engineer on the team, you will be responsible for helping to... 
    Senior
    Permanent employment
    Full time
    Work at office
    Immediate start
    Visa sponsorship

    Latitude AI

    Palo Alto, CA
    4 days ago
  •  ...send me a copy of your updated resumes Title: Sr. SRE / DevOps Engineer Location: Sunnyvale, CA (Only Local candidate) Client...  .../ DevOps Engineer at Sunnyvale, California location. As Site Reliability Engineer, the individual will work closely with multi-functional... 
    Senior
    Local area
    Immediate start

    Donato Technologies Inc

    Sunnyvale, CA
    16 days ago
  •  ...Senior DevOps Engineer Location: Sunnyvale, CA Onsite position Fulltime position JD: Must Have Skills: AWS, EKS, IAM, S3, Kubernetes, Kustomize, Flux, Crossplane, CRDs, Python, Github, Kafka, Linux, Trino Strong... 
    Senior
    Full time

    SARIAN Co

    Sunnyvale, CA
    2 days ago
  • $150k - $180k

     ...focused data center developer in Mountain View, CA is looking for a Senior Site Reliability Engineer to manage software infrastructure. This full-time position requires experience in Software Engineering or DevOps, with strong proficiency in Golang. The role emphasizes... 
    Senior
    Full time

    Verrus, LLC

    Mountain View, CA
    3 days ago
  •  ...join one of America's most beloved eCommerce companies as a Senior Release Engineer. This role will work across all web based brands and you'll...  ...Skill Set Specific experience deploying large scale web sites/products Experience deploying cloud based apps Strong... 
    Senior

    Black Swan Search

    Mountain View, CA
    3 days ago
  •  ...Job description Company is helping our client find a Senior DevOps Engineer to provide follow-the-sun coverage for the ADAS line of business...  ..., scaling), and platform teams to maintain uptime, reliability, and operational excellence across multiple production environments... 
    Senior

    Comrise

    Palo Alto, CA
    4 days ago
  •  ...Looper, Kubernetes, or Concord. Collaborate with developers, QA, DevOps, and product teams to ensure high-quality and timely releases....  ...on release progress, risks, and dependencies. Mentor junior engineers and promote best practices in release engineering and... 
    Senior

    Tranzeal

    Sunnyvale, CA
    3 days ago
  •  ...world running. Location: 5 on-site days a week in Sunnyvale,...  .... Our Team's Vision: Our Engineering team is shaping the future...  ...looking for an experienced Senior Site Reliability Engineer (SRE) with a strong...  ...experience with tools such as Azure DevOps, Jenkins, or GitLab CI/CD... 
    Senior
    Work experience placement

    Illumio

    Sunnyvale, CA
    5 days ago
  • $175k - $219k

     ...to 1 phase of building a new product. We are looking for a Senior DevOps Engineer who is a builder, not a maintainer. You will architect the...  ...Android), ensuring our developers can ship code instantly and reliably. This is not a role where you wait for a ticket or perfect... 
    Senior

    Drivemode

    Mountain View, CA
    3 days ago
  •  ...that turns raw, high‑volume telemetry into reliable, job‑centric insights and automation for...  ...GPU fleets. Join our team of innovative engineers who are building this platform and...  ...operating production distributed systems as SRE/DevOps/Platform Ops. Proven ownership of... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • $232k - $263k

     ...future of SaaS security! Sr. Staff Site Reliability Engineer As a Sr. Staff SRE at Obsidian ,...  ...will operate as a strategic partner to DevOps and Platform Engineering leadership, shaping...  ...roles ~3+ years operating at a senior or technical leadership level (Staff or... 
    Senior
    Work from home
    Flexible hours

    Obsidian Security

    Palo Alto, CA
    9 days ago
  •  ...patients worldwide. We're a team of engineers, clinicians, and innovators united by one...  ...of Position We are seeking a Senior DevOps Engineer to join the software team within...  ...tooling that supports scalable, secure, and reliable data platforms and APIs. Essential... 
    Senior
    Local area
    Worldwide
    Flexible hours

    Intuitive

    Sunnyvale, CA
    1 day ago
  •  ...Description Primary Function of Position We are seeking a Senior DevOps Engineer to join the software team within the Endoluminal business unit...  ...and tooling that supports scalable, secure, and reliable data platforms and APIs. Essential Job Duties Design and... 
    Senior

    Intuitive

    Sunnyvale, CA
    1 day ago
  • Poshmark, Inc. is seeking a talented Site Reliability Engineer to ensure the health and performance of our web-scale systems. You will collaborate with development teams and focus on automating and monitoring systems for high reliability. The ideal candidate has 5 years... 
    Senior

    Poshmark, Inc.

    Redwood City, CA
    4 days ago
  • $198k - $260k

     ...Senior Staff DevOps Engineer - CI/CD & Release Engineering At Sonatus, we're driving the transformation...  ...— your job is to make releases reliable, repeatable, and auditable. Artifact...  ...lunches, snacks, and beverages during on-site working days Wellness benefit... 
    Senior
    Work at office
    Worldwide
    Flexible hours
    Shift work

    Sonatus

    Sunnyvale, CA
    4 days ago
  • An innovative AI solutions company is seeking a Senior DevOps Engineer to architect and maintain the core infrastructure supporting cutting...  ...seamless deployments, and championing best practices in system reliability. Ideal candidates should have over 7 years of experience... 
    Senior
    Remote job
    Full time
    Flexible hours

    New Code Inc

    Palo Alto, CA
    3 days ago
  • $207k - $300k

    Google Inc. is looking for a Staff Software Engineer specializing in Site Reliability Engineering in Sunnyvale, CA. This role combines software and systems engineering to build and manage distributed systems, ensuring high reliability and uptime. The ideal candidate should... 
    Senior

    Google Inc.

    Sunnyvale, CA
    3 days ago
  • $197.54k - $278.46k

    42dot Inc. in Sunnyvale, CA is seeking a Sr. Staff DevOps Engineer to own the infrastructure for verifying and validating software-defined vehicles. You will optimize CI/CD pipelines, manage physical and virtual regression farms, and lead system integration with Hyundai... 
    Senior

    42dot Inc.

    Sunnyvale, CA
    4 days ago
  • $100k - $300k

    Job Title: Senior DevOps Engineer Position Type: FTE Location: Palo Alto, CA Salary Range / Rate (Currency): $100,000 - $300,000 Job ID#...  ...infrastructure (GCP, Docker, Terraform, etc.). Develop reliability and observability strategies to ensure system performance... 
    Senior

    Ipro Networks Pte. Ltd.

    Palo Alto, CA
    4 days ago
