Senior Site Reliability Engineer

Alembic Technologies

Site Reliability Engineer (SRE)

We're looking for an experienced Site Reliability Engineer (SRE) to help us scale our platform with reliability, observability, and operational excellence at the core. You'll partner with engineers and data scientists to build, automate, and maintain the infrastructure that powers our core platform—including data pipelines, ML workloads, and real-time analytics systems.

This is a hands-on, high-impact role with visibility across the stack and the opportunity to shape the future of our infrastructure and operations.

Key Responsibilities
  • Design, build, and maintain scalable infrastructure to support real-time analytics and machine learning workloads
  • Improve system reliability and performance through automation, observability, and proactive capacity planning
  • Own and evolve CI/CD pipelines, deployment automation, rollback mechanisms, and config management
  • Implement and maintain monitoring, alerting, and incident response processes (SLOs, runbooks, on-call rotations)
  • Collaborate across engineering and data science teams to drive a culture of performance and reliability
  • Ensure security, compliance, and operational readiness across our cloud infrastructure
  • Drive post-incident analysis and continuous improvement initiatives
What Will Help You Succeed
  • 8+ years of experience in SRE, DevOps, or infrastructure engineering roles
  • 5+ years of experience with datacenter operations and/or system and network administration
  • Experience with containerization (Docker) and orchestration (Kubernetes)
  • Strong knowledge of Linux systems, networking, and systems performance tuning
  • Solid understanding of infrastructure-as-code (e.g., Terraform, Ansible)
  • Good programming skills and the ability to apply sound coding principles to IaC and scripting code, using tools and languages such as Terraform, Ansible, Bash (shell scripting), and/or Python
  • Experience with monitoring and observability stacks (e.g., Prometheus, Grafana, Datadog, ELK, OpenTelemetry)
  • Proficiency with CI/CD tools and pipelines (e.g., GitHub Actions, ArgoCD, etc.)
  • Ability to debug complex systems and automate solutions in scripting languages
  • Excellent communication skills and the ability to work cross-functionally
Nice-to-Have
  • Experience with cloud and managed services (e.g., AWS)
  • Experience supporting data-intensive platforms (Spark, Airflow, Kafka, etc.)
  • Familiarity with security practices for cloud-native applications and infrastructure
  • Experience in high-compliance or SOC 2 environments
What You'll Get
  • Ownership of mission-critical infrastructure in a company solving real-world enterprise problems
  • A front-row seat to a high-performance engineering culture
  • The ability to influence how our platform scales—from deployment to incident management
  • An environment that values curiosity, accountability, and impact
Vacancy posted 5 days ago
