Head of Cloud Platform & SRE — Multi-Cloud & Observability

Baseten

Baseten in San Francisco is looking for a Senior Manager of Cloud Platform and Site Reliability to lead and grow the organization responsible for their machine learning platform infrastructure. The role requires managing team leads, setting technical direction, and ensuring the reliability of cloud operations. Ideal candidates have strong technical expertise in Kubernetes, cloud infrastructure, and proven incident management skills. They will contribute to establishing standards for service reliability and drive cross-functional collaboration with product and engineering teams.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Head of Cloud Platform & SRE — Multi-Cloud & Observability in San Francisco, CA vacancy
  •  ...us and help build the platform engineers turn to to ship...  ...As Senior Manager of Cloud Platform and Site Reliability...  ...infrastructure and SRE practice - from...  ...escalations, to shaping the multi-year roadmap for multi-...  ...inference infrastructure, and observability platforms. You operate... 
    Platform
    Temporary work
    Flexible hours

    BaseTen

    San Francisco, CA
    2 days ago
  • Lambda, a leader in AI cloud infrastructure, is seeking a Software Engineer specializing in observability platforms. The ideal candidate has over 8 years of experience, including 3+ years in Go and 5+ years practicing Site Reliability Engineering. Responsibilities include... 
    Platform
    Work at office

    Lambda

    San Francisco, CA
    5 days ago
  •  ...Reliability Engineer to lead the reliability, scalability, and observability strategies across their platform. The ideal candidate will have over 10 years of...  ...with a strong focus on distributed systems and cloud environments, preferably AWS. You will guide best practices... 
    Platform
    Flexible hours

    Fieldguide

    San Francisco, CA
    4 days ago
  • $175k - $210k

     ...Senior Manager, DevOps & SRE – Platform Reliability & Global Operations...  ...operational excellence of a complex, multi-platform ecosystem spanning...  ...as Code, and automation Observability and incident troubleshooting...  ...experience with Kubernetes, cloud platforms, and event driven... 
    Platform
    Work at office
    3 days per week

    Q-Cells

    San Francisco, CA
    2 days ago
  •  ...Site Reliability Engineer (SRE) CloudDevs works with fast-moving...  ...system reliability, efficiency, and observability. Define and monitor SLIs, SLOs,...  ...5+ years in SRE, DevOps, or Platform Engineering roles. Strong expertise with cloud infrastructure (AWS preferred... 
    Platform

    The10minutecareersolution

    San Francisco, CA
    1 day ago
  • $145.35k - $253.23k

     ...seeking a Manager, SAP S4 Public Cloud SAC Lead in Enterprise...  ...flows between SAP and non SAP platforms using SAP BTP and Databricks....  ...principles as applied to SAP centric, multi cloud, and hybrid analytics...  ...a calendar of holidays to be observed during the year and provides... 
    Platform
    H1b
    Local area

    KPMG

    San Francisco, CA
    3 days ago
  • $300 per month

     ...manufacturing, data center construction, and cloud services. If you want to do the most...  ...architecture and evolution of Crusoe's observability platform at scale. In this role, you will define...  ...'s observability platform supporting multi-region, multi-datacenter Kubernetes... 
    Platform
    Temporary work

    Crusoe

    San Francisco, CA
    5 days ago
  • $166k - $225k

     ...leading data and AI company in San Francisco seeks a Senior Software Engineer to enhance their infrastructure platform. This role requires building multi-cloud systems and scalable solutions for managing data and AI workloads. Ideal candidates have a strong programming... 
    Platform
    Flexible hours

    Databricks Inc.

    San Francisco, CA
    4 days ago
  •  ...about this opportunity, feel free to reach out and apply today! Responsibilities Architect and implement a secure, scalable cloud platform meeting FedRAMP High and DoD IL5 standards. Oversee the integration of physical infrastructure with cloud orchestration,... 
    Platform
    Remote work

    Hamilton Barnes Associates Limited

    San Francisco, CA
    1 day ago
  • $260k - $385k

     ...Software Engineer, Security Observability to join our Security team. In...  ...data systems to ensure high platform availability Collaborate...  ...like Terraform and working with cloud platforms such as Azure....  ...site reliability engineering (SRE), or security. The ability... 
    Platform
    Remote work
    Relocation package

    OpenAI

    San Francisco, CA
    2 days ago
  • A decentralized AI platform company in the United States is seeking an experienced ML Training Platform Engineer to design and build...  ...in infrastructure and platform engineering, with expertise in multi-cloud deployments and distributed systems. Responsibilities include... 
    Platform

    Pluralis Research

    San Francisco, CA
    3 days ago
  • A leading cloud security startup in the US is seeking a Platform Engineer to design and maintain AWS infrastructure for multi-tenant SaaS platforms. This role requires strong proficiency in AWS, Kubernetes, and infrastructure-as-code tools such as Terraform. You will work... 
    Platform
    Remote work

    Trades Workforce Solutions

    San Francisco, CA
    3 days ago
  •  ...'s leading AI-powered, cloud-native products that shape...  ...on the Compute Platform team, you will be a key...  ...evolving our next-generation, multi-tenant, cloud-native...  ...execution environments Observability & Operations: Drive...  ...including product management, SRE, and other engineering... 
    Platform

    IBM

    San Francisco, CA
    4 days ago
  • B Capital is seeking a Systems Engineer to join its Compute Platform team in San Francisco. This role involves maintaining a K8s-based...  ...systems challenges, focusing on GPU infrastructures and multi-cloud environments. The ideal candidate has extensive experience in... 
    Platform

    B Capital

    San Francisco, CA
    4 days ago
  • Zyphra in San Francisco is hiring a Platform Engineer responsible for designing and maintaining robust infrastructure. You will collaborate with teams to enhance system observability, manage cloud environments and ensure deployment safety. The ideal candidate has strong... 
    Platform

    Zyphra

    San Francisco, CA
    3 days ago
  •  ...layer 1 blockchain and developer platform that connects any L1 and L2,...  ...you. Experience: 3+ years of cloud infrastructure experience 2+...  ...enjoy building testing and observability capabilities that will accelerate...  ...processes. DevOps Engineer/SRE Transitioning to Blockchain... 
    Platform
    Remote job

    Blockchain Works

    San Francisco, CA
    3 days ago
  • $180k - $240k

     ...technology company is seeking a Site Reliability Engineer (SRE) who is passionate about leveraging data and automation....  ...should have expertise in Infrastructure as Code, Google Cloud, and effective observability practices. Competitive compensation ranges from $180K to... 
    Remote job

    Pantera Capital

    San Francisco, CA
    1 day ago
  •  ...We're looking for an ML Training Platform Engineer to architect, build, and scale...  ...training. Responsibilities Multi-Cloud Infrastructure: Design resource management...  ...tooling) with hands-on experience in observability, SRE practices, monitoring (Prometheus/... 
    Platform
    Work experience placement

    Pluralis

    San Francisco, CA
    4 days ago
  • $142.6k - $261.5k

     ...communities. The opportunity The Platforms Practice specializes in...  ...experience building and operating cloud infrastructure and Kubernetes...  ...communication across teams Apply SRE best practices, establish...  ...insights, a globally connected, multi-disciplinary network and... 
    Platform
    Summer holiday
    Flexible hours

    Ernst & Young Oman

    San Francisco, CA
    4 days ago
  •  ...Senior Manager, Software Engineering - Cloud Platform Location: New York, NY; San Francisco...  ...landing zone. Deep understanding of multi-account cloud strategies, centralized governance...  ...within highly technical platform or SRE teams. When you join Salesforce, you’ll... 
    Platform
    Work experience placement
    Shift work

    Salesforce, Inc.

    San Francisco, CA
    3 days ago
  •  .../Kubernetes Hybrid cloud (on-prem ? cloud, VMware...  ...consultant who can shape platform strategy, own end-to-end...  ...private, hybrid, and multi-cloud environments. The...  ...& Security Define SRE/operational runbooks, monitoring/observability (Aria Operations/Operations... 
    Platform

    Talent Search PRO

    San Francisco, CA
    2 days ago
  •  ...backend services that power our multi-tenant platform. Architect and optimize...  ...Java development, testing, observability, and deployment in a...  ...teams across DevOps, Security, SRE, and Application Engineering...  ..., Docker, and AWS (or other cloud providers). Solid knowledge... 
    Platform

    Saviynt

    San Francisco, CA
    5 days ago
  •  ...ROLE We're looking for a Platform/DevOps Engineer to own...  ...environments, observability, and developer tooling....  ...architectures across multiple cloud providers, self-managed...  ...platform engineering, DevOps, SRE, or related infra roles...  ..., FinOps, or multi-language dev environments... 
    Platform
    Work at office
    Local area
    Flexible hours

    Untolabs

    San Francisco, CA
    4 days ago
  •  ...reliability of our system across cloud, edge, and real-world environments. Our platform runs across distributed...  ...for making systems observable, diagnosable, and...  ...years of experience in SRE, infrastructure, or distributed...  ...Experience with multi-site or edge deployments... 
    Platform
    Permanent employment

    Claryo

    San Francisco, CA
    4 days ago
  • $182k - $249k

    Okta, Inc. is seeking an experienced Staff Site Reliability Engineer to join their Infrastructure Platform AGILE SRE team in San Francisco, CA. This role involves resolving infrastructure challenges through strong technical guidance, mentoring, and improvements to monitoring... 
    Platform

    Okta, Inc.

    San Francisco, CA
    5 days ago
  •  ...ensure the reliability and performance of its AI platform. The role involves architecting and operating...  ...scalability. Ideal candidates will have 3+ years in SRE or DevOps, programming proficiency, and experience with major cloud providers. The company focuses on optimizing... 
    Platform

    Blaxel

    San Francisco, CA
    1 day ago
  • $166.9k - $225.9k

    Drata is looking for a Site Reliability Engineer to join their SRE team in San Francisco. In this role, you'll operate at the intersection...  ...This position requires at least 6 years of experience in SRE or cloud engineering, with strong skills in Terraform and Datadog. The... 
    Platform

    Drata

    San Francisco, CA
    2 days ago
  • $202k - $272k

     ...Director, Strategic Cloud Partnerships San Francisco, California, USA; Seattle, Washington, USA We...  ...innovators and pioneers dedicated to shaping the future of observability. At New Relic, we build an intelligent platform that empowers companies to thrive in an AI-first... 
    Platform
    Contract work
    Work at office
    Remote work
    Flexible hours

    New Relic

    San Francisco, CA
    3 days ago
  • $175k - $225k

     ...will be building the backend systems that power LangChain's observability and evals platform. You will work on the core services that allow developers...  ...with database systems (Postgres, Redis, Clickhouse), and cloud platforms (AWS, GCP, Azure) Strong communication... 
    Platform
    Work at office
    Flexible hours

    LangChain

    San Francisco, CA
    5 days ago
  •  ...identifying new opportunities. The successful candidate will collaborate with a supportive team to promote their AI-driven cloud observability platform, helping customers navigate the sales journey. Ideal applicants will showcase proven sales experience, strong... 
    Platform
    Full time

    Rockstar

    San Francisco, CA
    5 days ago
