Site Reliability Engineer

gamma.app

We're building the creative layer for modern communication. Every month, over a billion people make presentations — but the tools they use to make them haven't evolved in decades. We're changing that, using AI to disrupt a massive market.

Millions of people rely on Gamma to create, teach, and persuade, creating more than 1 million gammas every day.

We see Gamma as the next great workplace tool, combining viral B2C love with a massive B2B opportunity. We believe AI can be a true creative partner: one that understands context, clarity, and taste.

We’ve reached a $2.1B valuation, crossed $100M in annual recurring revenue, and have been profitable since 2023.

We're an imaginative, passionate team who takes our work seriously, but not ourselves. Our culture is warm, a little quirky, and fueled by curiosity.

About the role

Gamma's infrastructure needs to be rock-solid for millions of daily users while enabling our engineering teams to ship fast. You'll own the operational health of our full backend platform, building automation and tooling that improves reliability and partnering with engineering to design systems that are observable, resilient, and easy to operate. Your work directly impacts every Gamma user's experience.

This is a high-impact role where you'll balance reliability with velocity, knowing when to move fast and when to prioritize stability. You'll lead incident response, drive systemic improvements, and help shape how Gamma scales to serve its next 100 million users.

Our team has a strong in-office culture and works in person 4–5 days per week in San Francisco. We love working together to stay creative and connected, with flexibility to work from home when focus matters most.

What you'll do

  • Own the reliability, availability, and performance of Gamma's production systems, which run primarily on AWS
  • Build observability infrastructure with metrics, logging, tracing, and alerting that provide deep visibility into system health
  • Design automation to reduce toil, improve deployment safety, and accelerate incident resolution
  • Lead incident response, conduct blameless post-mortems, and drive systemic improvements to prevent recurring issues
  • Partner with engineering teams on architecture reviews, SLOs/SLIs, and reliability best practices
  • Manage and optimize our infrastructure, including compute, networking, databases, and managed services

What you'll bring

  • 5+ years in Site Reliability Engineering, DevOps, or systems engineering roles with deep AWS expertise
  • Strong programming skills (Python, Go, or TypeScript/Node.js) for building tools and automation
  • Experience with infrastructure-as-code (Terraform, CloudFormation) and comprehensive observability solutions
  • Track record of improving system reliability through automation, monitoring, and architectural improvements
  • Solid understanding of networking, distributed systems, containerization (Docker, Kubernetes), and database performance
  • Strong incident management and debugging skills for complex production issues
  • (Nice to have) Experience scaling SaaS applications to millions of users
  • (Nice to have) Background with real-time collaborative systems, Kafka, chaos engineering, or service mesh technologies
  • (Nice to have) AWS certifications or experience with security/compliance requirements (SOC 2, ISO 27001)

Compensation range

Final offer amounts are determined by multiple factors, including but not limited to experience and expertise in the requirements listed above.

If you're interested in this role but you don't meet every requirement, we encourage you to apply anyway! We're always excited about meeting great people.

We're building on a full TypeScript stack centered around some of the most modern and popular technologies.

We use our own custom, open-source AI prompting framework, AIJSX. We've built many custom tools in-house, and we also use newer ones such as the Vercel AI SDK.

Our tiny team operates at massive scale:

1M+ gammas created every day

70M users around the world

6M+ AI images generated daily

1 trillion LLM tokens processed per month

Life at Gamma

You get energy from small teams doing big things.

You love when design, code, and storytelling overlap.

You default to action, even when the answer isn’t clear yet.

You value details, but know when to ship and move on.

You bring both the spreadsheets and the sparkle, equal parts workhorse and unicorn.

You believe AI should amplify creativity, not replace it.

You know kindness and intensity are not opposites.

You like working with people who care deeply: about their craft, their teammates, and the users on the other side of the screen.

Who we are

Gamma is full of imaginative, passionate people who take their work seriously but not themselves. The culture is warm, a little quirky, and fueled by curiosity. It’s the kind of place where you’ll debate a pixel on Monday, laugh over someone’s keyboard setup on Tuesday, and ship something remarkable by Friday.

We care about craft, move with intention, and don’t mind getting a little scrappy. It’s fast, creative, and occasionally chaotic — but that’s what makes it interesting.

Here’s a bit about what it’s like to work here, from people on the inside:

“quirky, inspiring, fun, a little wild in the best way”

“You can have an idea and just run with it.”

“Everyone’s talented and humble — the mix keeps you sharp.”

“We ship cool stuff, learn a ton, and laugh a lot doing it.”

Meet the team

We're a team of dreamers and doers building in beautiful San Francisco.

We're kabaddi enthusiasts, pickleballers, dog herders, woodworkers, keyboard nerds, potters, and more. We can't wait to meet you!

Vacancy posted 4 days ago