Senior Site Reliability Engineer

$175k - $250k

The Recruiting Guy

1 day ago

This range is provided by The Recruiting Guy. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more.

Job Title: Senior Cloud Infrastructure Engineer
Location: San Francisco, CA. Remote unavailable.
Modality: On-Site only.
Must live within commuting distance of San Francisco or be willing to relocate.
Relocation Assistance: No
Employment Type: Salaried W2 Full-Time.
Salary Range: $175,000 - $250,000

About The Company

We represent a pioneering open-source technology company in San Francisco that is transforming the way creators interact with generative AI. They are the team behind a powerful, node-based visual interface that gives artists, developers, and innovators the ability to design, control, and customize AI workflows with complete flexibility. Their platform allows users to connect modular components, build complex pipelines, and run everything locally with impressive speed and precision. Their mission is to make generative AI open, transparent, and accessible to everyone. Built around community collaboration and creative empowerment, their tools help users experiment freely and bring their ideas to life. Whether it is visual storytelling, image generation, or advanced machine learning, their technology gives creators the freedom to explore without limitations.

About The Role

In this role, you will take the lead on designing, deploying, and maintaining large-scale distributed systems that power AI workloads. The ideal candidate is deeply technical, self-sufficient, and motivated by solving complex infrastructure challenges. You will work closely with core engineers to shape the company’s long-term infrastructure vision while ensuring scalability, performance, and reliability across environments.

What You’ll Do
  • Design, build, and maintain the core infrastructure that powers AI workloads at scale
  • Manage and automate GPU compute clusters using tools such as Python, Kubernetes, Terraform, and Ansible
  • Architect and operate systems for orchestration, observability, distributed storage, and networking
  • Ensure reliability, scalability, and performance across production environments
  • Collaborate closely with core engineers to design infrastructure for new features and systems
  • Contribute to technical strategy and long-term infrastructure vision
  • Drive best practices for infrastructure automation, deployment, and monitoring

Requirements
  • 5+ years of experience as an Infrastructure Engineer or Site Reliability Engineer building and operating large-scale distributed systems
  • Skilled in Python and comfortable working with infrastructure-as-code tools such as Terraform and Ansible
  • Familiar with container orchestration systems such as Kubernetes and related tooling like FluxCD, Prometheus, and Grafana
  • Capable of managing high-performance GPU environments across cloud and bare metal setups
  • Highly adaptable, resourceful, and motivated by building things from the ground up
  • Excited to work in a small, fast-growing team where autonomy and accountability are key
  • Comfortable working on-site in a startup setting where collaboration and speed matter most

Bonus Points
  • Experience contributing to or maintaining open-source projects
  • Background working with AI infrastructure, ML pipelines, or GPU orchestration
  • Strong computer science fundamentals and ability to work across different programming languages or frameworks

Skills: Prometheus, FluxCD, Kubernetes, Python, Ansible, Terraform, infrastructure, Grafana

Vacancy posted 2 days ago
