DevOps Manager: AWS, Kubernetes, SRE Lead

AppZen

Requirements

  • 8+ years of experience in DevOps, SRE, infrastructure, or platform engineering, with at least 2 years leading or managing engineers (formal or tech-lead capacity)
  • Deep, hands-on AWS experience across compute, networking, IAM, data, and observability services; comfortable designing for multi-account, multi-region SaaS
  • Strong production experience with Kubernetes (preferably EKS), including upgrades, autoscaling, and securing multi-tenant clusters
  • Demonstrated hands-on operations experience with PostgreSQL at scale (query and index tuning, replication, HA/failover, backups, and version upgrades) and with Elasticsearch / OpenSearch (cluster sizing, shard strategy, ingest tuning, and incident response)
  • Working knowledge of additional datastores commonly used in SaaS: Redis, Kafka or other message brokers, and object storage; comfortable evaluating tradeoffs between managed services (RDS, Aurora, ElastiCache, MSK, OpenSearch Service) and self-managed options
  • Proficient with Terraform and modern IaC patterns; clear opinions on module design, state management, and PR-driven workflows
  • Solid scripting and automation skills in at least one of Python, Go, or Bash
  • Track record of designing and operating CI/CD pipelines at scale (GitHub Actions, Jenkins, ArgoCD, or similar)
  • Experience running production workloads under SOC 2 or comparable compliance frameworks; comfortable partnering with Security on audits and remediation
  • Excellent communication and stakeholder skills; able to translate infrastructure tradeoffs into language product, finance, and customer teams understand
  • (Desirable) Experience supporting AI/ML or data-heavy SaaS workloads (GPU fleets, vector stores, large async pipelines)
  • (Desirable) Familiarity with service mesh (Istio, Linkerd) and progressive delivery (Argo Rollouts, feature flags)
  • (Desirable) Background scaling FinOps practices and managing cloud spend at $5M+ annual run-rate
  • (Desirable) Experience operating multi-tenant SaaS with strict data isolation requirements for enterprise finance customers
  • (Desirable) Exposure to multi-cloud or hybrid-cloud environments (Azure, GCP)
What the job involves

As Manager, DevOps, you will lead a DevOps team responsible for the AWS-based infrastructure, Kubernetes platform, CI/CD systems, production datastores (PostgreSQL, Elasticsearch, Redis, and more), and observability stack that power AppZen. You'll set technical direction, coach engineers, partner closely with Product Engineering and Security, and stay close enough to the work to tune a slow Postgres query, debug an Elasticsearch cluster under load, write Terraform, or review a Helm chart yourself.

This is a builder-manager role. We expect roughly 60% leadership and delivery management, and 40% hands-on technical contribution.

  • Manage, coach, and grow a team of 3-6 DevOps and platform engineers; own hiring, performance, growth plans, and 1:1s
  • Set quarterly priorities aligned to engineering and business goals; communicate progress and risk clearly to leadership
  • Build a healthy on-call culture: balanced rotations, blameless postmortems, and continuous reduction of toil
  • Own the architecture, cost, and reliability of AppZen's AWS footprint across multiple regions and accounts
  • Drive infrastructure-as-code standards using Terraform; champion modular, reviewable, version-controlled infrastructure
  • Partner with Security and Compliance on SOC 2, ISO 27001, GDPR, and customer audit requirements; harden IAM, network, and secrets management
  • Manage cloud spend: visibility, forecasting, and ongoing optimization (Savings Plans, rightsizing, multi-tenant efficiency)
  • Hands-on ownership of PostgreSQL in production: schema reviews, index and query tuning, vacuum/bloat management, replication, failover, point-in-time recovery, and major-version upgrades (RDS / Aurora)
  • Run and scale Elasticsearch / OpenSearch clusters: shard and index design, JVM and heap tuning, snapshot strategy, hot-warm tiers, and incident response under heavy ingest or query load
  • Operate supporting datastores such as Redis (caching, queues), Kafka or SQS/SNS (streaming and async), and S3-backed data lakes; define patterns for high availability, durability, and disaster recovery
  • Partner with engineering on capacity planning, performance benchmarking, data tier cost optimization, backup/restore drills, and customer data isolation for multi-tenant workloads
  • Operate and improve our EKS-based Kubernetes platform: cluster lifecycle, autoscaling, multi-tenancy, and workload isolation
  • Define golden paths for service teams using Helm, Kustomize, and GitOps tooling such as ArgoCD or Flux
  • Set patterns for service mesh, ingress, and zero-downtime deployments
  • Lead the design of internal developer platform capabilities so product teams can ship safely and quickly without infra friction
  • Maintain and improve build, test, and deploy pipelines (e.g., GitHub Actions, Jenkins, ArgoCD); enforce supply-chain security and artifact provenance
  • Drive measurable improvements in DORA metrics: lead time, deploy frequency, change failure rate, and MTTR
  • Own the observability stack (e.g., Datadog, Prometheus, Grafana, OpenTelemetry); ensure consistent metrics, logs, and traces across services
  • Define and operationalize SLOs and error budgets in partnership with service owners
  • Lead incident command for high-severity events and convert learnings into durable systemic fixes

Vacancy posted 21 hours ago