Senior Site Reliability Engineer — AI Cloud Reliability
Dormont Manufacturing Co
Crusoe is seeking a Site Reliability Engineer in San Francisco to ensure the stability and performance of its GPU cloud platform. Successful candidates will have a minimum of 5 years in cloud operations and strong knowledge of tools like Prometheus and Grafana. This role involves collaboration across teams to improve metrics and handle incident response. Benefits include industry-competitive pay, health insurance, and a 401(k) match.
- Anyscale is seeking a Senior Site Reliability Engineer to join our Infrastructure team in San Francisco, California... ...candidate will enhance distributed AI application development and work on... ...with strong experience in Kubernetes and cloud-native technologies. This role focuses... Senior
- ...stacks to accelerate the progress of AI applications out into the real world.... ...About the role Anyscale is looking for a Senior Site Reliability Engineer to join the Infrastructure team.... ...running distributed AI applications in the cloud as easy as on your laptop. As part of... Senior
- ...security, delivering an AI-powered platform that governs... ...complex, distributed, cloud‑native systems. As a Staff Platform Engineer, you will play a... ...leadership role. You will own reliability for major platform... ...Platform Engineering, or Site Reliability Engineering... Senior
- ...About the job Senior Site Reliability Engineer About the Company Stellar is a decentralized, public blockchain... ...~5+ years of experience of working in cloud-based systems operations, as a SRE or... ...code Experience experimenting with AI-driven approaches to operations... Senior
- ...Hyperbolic Labs is on a mission to democratize AI by breaking down the barriers to computing power with our Open-Access AI Cloud. By aggregating computing resources... ...About the Role We're seeking a Site Reliability Engineer to ensure Hyperbolic's GPU marketplace and... Senior
$160k - $250k
...maintain a hybrid infrastructure with public clouds when the right fit. As we continue to... ..., we also need to grow our DevOps and Site Reliability team to maintain the reliability of our... ...passionate about creating a revolutionary AI company. At Hive, you will have a steep... Senior
- ...create the next generation of Gen AI-driven code reviewers: a... ...significantly outperforms individual engineers. We combine language models... ...are seeking an experienced Site Reliability Engineer to join our Platform... ...infrastructure on Google Cloud Platform to support CodeRabbit... Senior
- OutSystems, Inc. is looking for a Site Reliability Engineer to join their team in San Francisco, CA. The ideal candidate will lead the onboarding of services and teams to reliability tenets while establishing SLOs and SLAs. Proficiency in Python and experience with Kubernetes... Senior, Flexible hours
$300k
Join a stealth-mode startup building out their AI and cloud platform, powered by thousands of H100s, H200s, and B200s, ready... ...full-scale model training, or inference. As a Platform Engineer/Senior Site Reliability Engineer, you’ll own the reliability, performance, and automation... Senior
- ...services and teams to the reliability tenets. Establish and maintain... ...infrastructure, ensuring cloud‑native best practices. Collaborate... ...in Python, using Gen AI tooling to accelerate... ...6+ years of experience in Site Reliability Engineering, managing infrastructure and... Senior
- ...human. Heidi is building an AI Care Partner that works alongside... .... We’re a team of doctors, engineers, designers, researchers, and... ...to-end. Improve operational reliability: Identify recurring issues... ...improve Kubernetes clusters, cloud infrastructure, and core platform... Senior, Work at office, Worldwide
- ...acquisition, and Connor was a machine learning research engineer at Scale AI. The rest of our team comes from companies like... ...go-to-market with state-of-the-art AI. As a Senior SRE, you'll tackle the scaling and reliability challenges that come with adding terabytes of... Senior
$60 per hour
Senior Site Reliability Engineer (Copy) Seattle Hybrid (Hybrid location). Full-time. About Us Supio is a trusted AI platform purpose-built for law firms, reshaping how data drives impactful outcomes. Our innovative approach blends technology with deep legal expertise,... Senior, Full time, Work at office, Flexible hours, $166.9k - $225.9k
...operates as both a central engineering function and an embedded reliability practice. You'll be part... ...'ll work across a modern cloud‑native stack to help... ...+ years of experience in Site Reliability Engineering,... ...Experience with AIOps—using AI/ML‑based tooling for anomaly... Senior, Flexible hours
- Senior Site Reliability Engineer, Hybrid - San Francisco. Our Mission & Values:... ...operates as both a central engineering function and an embedded reliability... ...You'll work across a modern cloud-native stack to help Drata... ...with AIOps - using AI/ML-based tooling for anomaly... Senior, Work at office, Immediate start, Worldwide, Monday to Friday, Flexible hours
$175k - $250k
...250,000.00/yr Job Title: Senior Cloud Infrastructure Engineer Location: San Francisco, CA... ...unavailable. Modality: On-Site only. Must live within... ...interact with generative AI. They are the team behind... ...scalability, performance, and reliability across environments. What... Senior, Full time, Remote work, Relocation, Relocation package
- ...information, please read our... Senior Site Reliability Engineer. Locations:... ...secure infrastructure, while ensuring cloud-native best practices; Collaborate... ...in Python supported by Gen AI tooling to accelerate development of... Senior, Immediate start, Remote work, Worldwide
- ...the data and action layer for AI agents. We give agents fast,... ...now includes AI agents. Engineering Hiring Sprint We're growing... ...Engineers Database Engineers Site Reliability Engineers Extensibility API... ...across multiple regions and clouds. You’ll build and maintain the... Senior, Work at office, Local area, Flexible hours
$127k - $249k
The Team Platform Engineering is the department within SRE that is responsible... .... Among these are our multi-cloud-provider Kubernetes... ...components that ensure cluster reliability and security (e.g., CoreDNS,... ...redefined the database for the AI era, enabling innovators to... Senior, Work at office, Local area, Remote work, Worldwide, Flexible hours, $300 per month
...We’re crafting the engine that powers a world... ...create ambitiously with AI — without... ...responsible, transformative cloud infrastructure.... ...building the most reliable, energy-efficient,... ...that mission. As a Site Reliability Engineer... ...partner closely with senior SREs,... Senior, Temporary work
- Senior Site Reliability Engineer - AI Infrastructure Location: Global Remote / San Francisco • Full-Time About Andromeda Andromeda Cluster was founded... ...Andromeda works with leading AI labs, data centers, and cloud providers to deliver compute when and where it’s needed... Senior, Full time, Remote work
$220k - $235k
...strategic, high-output Staff/Senior Staff SRE to define the future of our cloud platform and champion engineering excellence across Ironclad.... ...direction for the Site Reliability Engineering team and our broader... ...ArgoCD Experience with modern AI enabled tools such as... Senior, Full time, Work at office, $151.5k - $252.5k
A leading technology firm is seeking a Senior Site Reliability Engineer to join their Data Cloud engineering team in San Francisco. The role requires expertise in Azure infrastructure and SaaS applications, focusing on building reliable, scalable systems. The ideal candidate... Senior, $181k - $263k
Senior Staff Site Reliability Engineer. Locations: San Francisco. Time type: Full... ...Staff Site Reliability Engineer who will set the technical... ...strategy across Kubernetes, cloud resources, and database infrastructure... ...Familiarity with LLMs and AI-assisted development... Senior, Work from home, Flexible hours, Night shift
- ...Alembic is the pioneering Causal AI platform. We help the world's... ...perform under real-world scale, reliability, and security demands — and we're looking for an engineer who wants to own the foundation... ...across our network and cloud infrastructure. Partner across... Senior
$232k - $319k
...Secure Every Identity, from AI to Human Identity is the key... ...service with great people and reliable, cost-effective, and efficient... ...partnership with architects and product engineering Build a world-class... ...of scalable, self-service Cloud infrastructure platforms (e.g.... Senior, Permanent employment, Local area, Worldwide, Flexible hours
- ...Udaip Cloud-Based Data And Ai Platform Engineer At U.S. Bank, we're on a journey to do our best. Helping the customers and businesses we serve to make better and smarter financial decisions and enabling the communities we support to grow and succeed. We believe it takes... Senior, Temporary work, Work experience placement
$261k - $326k
A technology company specializing in AI infrastructure is seeking a Principal Engineer to enhance reliability and scalability of cloud systems. This role demands over 15 years of experience in production engineering or related fields and involves setting technical directions... Senior
- Drata is seeking a Senior Site Reliability Engineer in San Francisco. In this role, you will engage in reliability architecture for product teams,... ...ideal candidate has at least 6 years of experience in SRE or Cloud Engineering, expertise in Terraform and Datadog, and is... Senior
- ...building the category-defining AI workflow automation platform that... ...We’re hiring an SRE to join our engineering team at Plenful and take ownership of the reliability and performance of the systems that... ...operating production systems in cloud environments, ideally AWS. ~... Work at office, Remote work, Flexible hours, 2 days per week


