Head of Cloud Platform & SRE — Multi-Cloud & Observability
Baseten
Baseten in San Francisco is looking for a Senior Manager of Cloud Platform and Site Reliability to lead and grow the organization responsible for their machine learning platform infrastructure. The role requires managing team leads, setting technical direction, and ensuring the reliability of cloud operations. Ideal candidates have strong technical expertise in Kubernetes, cloud infrastructure, and proven incident management skills. They will contribute to establishing standards for service reliability and drive cross-functional collaboration with product and engineering teams.
- ...seeking a Director of Site Reliability Engineering to lead a dynamic SRE team. This senior role involves shaping engineering culture while... ...build. This position requires a breadth of experience in SRE, cloud technologies, as well as strong leadership and communication...Platform
- ...Overview: Job Title: Observability Architect (Dynatrace) Location... ..., infrastructure, cloud platforms, and containerized environments... ...coverage across hybrid and multi-cloud environments. Cloud... ...Partner with DevOps, SRE, IT, and business teams to align...PlatformContract work
- Neara is seeking a Sr. Site Reliability Engineer to design and operate the multi-cloud infrastructure powering Optura’s AI Platform. You will own systems end-to-end and partner with teams to ensure secure, scalable services. With at least 8 years in production environments...Platform
$170k - $250k
...Site Reliability Engineer (SRE) Location: San Francisco, CA... ...building a next-generation GPU cloud platform for enterprises, startups, and... ...engineering to build the automation, observability, and platform infrastructure that powers their multi-cloud GPU marketplace at scale...PlatformWork at officeVisa sponsorshipFlexible hours
- THE ROLE As Senior Manager of Cloud Platform and Site Reliability, you... ...our cloud infrastructure and SRE practice — from coaching your... ...escalations, to shaping the multi-year roadmap for multi-cloud... ...infrastructure, and observability platforms. You operate at the...PlatformTemporary workFlexible hours
- Devops Engg /SRE Jobs in SRP Systems Inc San Francisco... ...focus on improving platform reliability, availability... ...of distributed (multi‑tiered) systems, algorithms... ...Kafka. Preferred Skills Cloud platforms (AWS, Azure,... ...etc.). Monitoring and observability tools such as Dynatrace...Platform
- ...Join us and help build the platform engineers turn to to ship AI... ...processes, automations, and observability tooling that keep our platform... ...like these as part of the SRE team: Improve Baseten... ...the reliability of Baseten's multi-cloud Kubernetes infrastructure, including...PlatformFlexible hours
$172.5k - $260.1k
...Manager, Software Engineering - Cloud Platform Location New York, NY; San... ...on as an afterthought. SRE Mindset: Engineering for failure... ...999% availability standard. Observability: Relying on telemetry,... ...zone. Deep understanding of multi-account cloud strategies, centralized...PlatformWork experience placementShift work$305k - $385k
...to build beneficial AI systems. About the role The Anthropic Platform Org’s mission is to help builders build. Our vision is to be the... ...is responsible for our APIs, self-serve developer experience, multi-cloud integrations, and agentic infrastructure. We serve a wide...PlatformWork at officeVisa sponsorshipFlexible hours$170k - $230k
...Site Reliability Engineer (SRE) Palo Alto / San Francisco... ...Mithril is an AI infrastructure platform built to make GPU compute... ...across a heterogeneous, multi-cloud environment. About the Opportunity... ...will build the automation, observability, and tooling that allows...PlatformWork at officeLocal area1 day per week
- ...DESCRIPTION The Senior DevOps & SRE Manager - Platform Reliability & Global... ...operational excellence of a complex, multi‑platform ecosystem spanning... ...as Code, and automation Observability and incident... ...experience with Kubernetes, cloud platforms, and event‑driven...PlatformWork at office3 days per week
- ...us and help build the platform engineers turn to to ship... ...Manager for Baseten's Cloud Platform team, you will... ...platform engineering, or SRE context (not managing... ...HAVE Experience with OSS observability tooling (Prometheus,... ...OpenTelemetry). Background in multi‑cloud environments or...PlatformFlexible hours
- ...transformation initiatives by building resilient multi-cloud infrastructure, automating deployments at scale, and driving platform reliability for enterprise SaaS products.... ...) Background in distributed systems observability with OpenTelemetry About APPIT Software Solutions...PlatformFlexible hours
- ...Missionforce Operations (Private Cloud Edition)Skip to main... ...processes on a single platform.We are seeking a senior... ...infrastructure, DevOps/SRE, customer success,... ..., including multi-tenancy, identity and access... ...isolation, APIs, integration, observability, release management,...PlatformFor contractorsWork at officeRemote workShift work
$148.5k - $223.9k
...Reliability Engineer (Cloud Automation) Location:... ...About the Team The Cloud Platform Engineering team builds... ...on as an afterthought. SRE Mindset: Engineering... ...availability standard. Observability: Relying on telemetry,... ...of new, fully governed multi-account cloud environments...PlatformWork experience placementShift work
- ...passionate about building and operating production-grade systems. This role involves ownership of AWS infrastructure, Kubernetes platforms, and continuous improvement efforts in a high-pressure environment. The ideal candidate has deep AWS expertise, strong coding skills...Platform
- ...emphasizes collaboration across teams and requires expertise in cloud systems, CI/CD practices, and reliability metrics. Candidates... ...automation and configuration management, experience with cloud platforms like AWS, and a strong ability to work in a distributed environment...PlatformRemote job
$140k - $185k
...the production environment: Strengthen observability: Reduce operational toil: Support... ...What we’re looking for 3-6+ years in SRE, DevOps, Platform, or operations-heavy engineering roles... ...under pressure. Experience operating cloud infrastructure (AWS preferred). Working...PlatformWork at officeWorldwide$166k - $225k
...leading data and AI company in San Francisco seeks a Senior Software Engineer to enhance their infrastructure platform. This role requires building multi-cloud systems and scalable solutions for managing data and AI workloads. Ideal candidates have a strong programming...PlatformFlexible hours
- ...about this opportunity, feel free to reach out and apply today! Responsibilities Architect and implement a secure, scalable cloud platform meeting FedRAMP High and DoD IL5 standards. Oversee the integration of physical infrastructure with cloud orchestration,...PlatformRemote work
$300k
...mode startup building out their AI and cloud platform, powered by thousands of H100s, H200s,... ...and Kubernetes environments. Develop observability, alerting, and auto-healing systems for... ...Must Have: 7+ years of experience in SRE, DevOps, or Infrastructure Engineering...Platform
- ...will work at the intersection of cloud infrastructure, Kubernetes, automation, and observability, with a strong focus on... ...improvement of Kubernetes (EKS) platforms Reliability of production systems... ...Experience troubleshooting complex, multi-layer Kubernetes issues...PlatformRemote work
- ...Site Reliability Engineer (SRE) CloudDevs works with fast-moving... ...system reliability, efficiency, and observability. Define and monitor SLIs, SLOs... ...5+ years in SRE, DevOps, or Platform Engineering roles. Strong expertise with cloud infrastructure (AWS preferred...Platform
- ...the enterprise sustainability platform. Companies like Airbnb,... ...Engineering Manager for our Cloud Infrastructure team to help lead... ...infrastructure that powers Watershed’s multi-region deployment on GCP,... ...database architecture, observability, reliability & SLOs, cloud security...PlatformWork at officeRemote work
- Hyperbolic Labs, based in San Francisco, seeks a Platform Engineer to design the control plane for our innovative GPU marketplace. This... ...for implementing identity management, billing systems, and multi-cloud abstractions that enhance developer experiences. An expert in...Platform
- B Capital is seeking a Systems Engineer to join its Compute Platform team in San Francisco. This role involves maintaining a K8s-based... ...systems challenges, focusing on GPU infrastructures and multi-cloud environments. The ideal candidate has extensive experience in...Platform
- Zyphra in San Francisco is hiring a Platform Engineer responsible for designing and maintaining robust infrastructure. You will collaborate with teams to enhance system observability, manage cloud environments and ensure deployment safety. The ideal candidate has strong...Platform
- ...layer 1 blockchain and developer platform that connects any L1 and L2,... ...you. Experience: 3+ years of cloud infrastructure experience 2+... ...enjoy building testing and observability capabilities that will accelerate... ...processes. DevOps Engineer/SRE Transitioning to Blockchain...PlatformRemote job
$325k - $405k
...Software Engineer, Security Observability to join our Security team. In... ...data systems to ensure high platform availability. Collaborate closely... ...Terraform and working with cloud platforms such as Azure.... ...site reliability engineering (SRE), or security. The ability to...PlatformRemote workRelocation package$177.19k - $364.8k
...Software Engineer to join our Observability team at Pinterest. This role... ...collaboration: Partner with SRE, Infrastructure, Product Engineering... ...Experience building internal platforms or tools with strong adoption... ...solutions. Familiarity with cloud‑native architectures and...PlatformWork at officeRelocationRelocation package