Senior / Staff Site Reliability, Platform Engineering

Saviynt

About Saviynt

Saviynt is a leader in identity security, delivering an AI-powered platform that governs and secures access to applications, data, and business processes for global enterprises and government institutions. Built for the AI era, Saviynt helps organizations move faster, securely and compliantly.

Why This Role Matters

Saviynt’s SaaS platform runs on complex, distributed, cloud-native systems. As a Staff Platform Engineer, you will play a critical role in ensuring these systems remain highly available, scalable, and secure as the company grows. This is a hands-on engineering and technical leadership role. You will own reliability for major platform domains, design scalable solutions on Kubernetes and AWS, and drive automation and reliability improvements across multiple teams.

What You’ll Do

In this pivotal role, you will design, build, and maintain the shared infrastructure services and platforms that our product and application teams depend on. You will focus on creating reusable, reliable, and scalable solutions that abstract away complexity, enabling other teams to focus on their core business logic and deliver features faster in a multi-cloud environment.

  • Design and build core platform components and shared infrastructure services that other development teams integrate with and use to deploy and operate their applications.
  • Architect, implement, and manage highly available and scalable Kubernetes platforms as a service for internal consumers.
  • Develop robust, internal-facing tools and automation for infrastructure provisioning and management, primarily using Go (Golang).
  • Architect and optimize foundational solutions within cloud environments (AWS, Azure, etc.), focusing on creating reusable patterns and modules for other teams.
  • Design and implement shared event-driven architecture components and messaging platforms, using technologies like Kafka or Google Pub/Sub, that product teams can easily use.
  • Develop and maintain robust CI/CD pipelines (e.g., GitLab CI and ArgoCD) as a service, providing standardized and automated deployment workflows for development teams.
  • Design and build resilient distributed systems components that serve as building blocks for other applications, focusing on reliability, fault tolerance, and performance.
  • Manage and optimize shared infrastructure across multi-region cloud environments, ensuring that platform services are globally available and performant for all consumers.
  • Establish and enhance centralized observability and monitoring platforms and tools that provide self-service insights for consuming teams.
  • Define and implement clear, well-documented RESTful API designs for the infrastructure services you build, ensuring ease of integration for internal clients.
  • Implement and manage service mesh capabilities (e.g., Envoy, Istio), providing traffic management, security, and policy enforcement as a shared platform for services.
  • Design, implement, and optimize highly available relational database services or shared data platforms for broad organizational use.
  • Collaborate closely with product development teams to understand their infrastructure needs and pain points, providing technical guidance and support.
  • Participate in on-call rotations to support the critical shared infrastructure you build.

What We’re Looking For

  • 6+ years of experience in an Infrastructure Development, Platform Engineering, or Site Reliability Engineering role, with a strong focus on building tools and services for other engineers.
  • Deep expertise with Kubernetes in production environments, particularly in providing it as a platform (single-tenant and multi-tenant deployment architectures).
  • Strong programming skills in Go (Golang) and Python, with experience building robust, maintainable backend services and automation.
  • Extensive hands-on experience with at least one major cloud provider (AWS, GCP, or Azure); multi-cloud experience is a strong plus, especially in building abstractions over them.
  • Proven experience designing and implementing event-driven architecture and message queuing systems (e.g., Kafka, RabbitMQ, NATS) as shared services.
  • Solid understanding of and practical experience with CI/CD pipeline tools (especially GitLab CI), and experience establishing automated delivery processes for other teams.
  • Demonstrable experience designing and operating distributed systems, with an understanding of patterns for creating reliable, shared components.
  • Familiarity with multi-region cloud environments and strategies for building globally distributed and highly available platforms.
  • Proficiency in establishing and using comprehensive observability and monitoring platforms (e.g., Prometheus, Grafana, ELK stack, Datadog) for shared infrastructure.
  • Strong experience with RESTful API design principles and building well-documented, consumable APIs.
  • Knowledge of service mesh concepts and practical experience with solutions like Istio in a platform context.
  • Hands-on experience with relational databases (e.g., MySQL, PostgreSQL), ideally in managing them as a service.
  • Excellent communication skills and the ability to clearly articulate complex technical concepts to both technical and non-technical audiences.
  • A strong customer-centric mindset, treating internal development teams as your primary customers.
  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical or military experience, required.

Why Join Saviynt

  • Work on a large-scale, cloud-native SaaS platform.
  • Solve complex reliability challenges at scale.
  • Influence platform architecture and engineering practices.
  • Competitive compensation, benefits, and career growth.

Security & Compliance

This role requires adherence to Saviynt’s information security and privacy policies, including annual security training.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Senior / Staff Site Reliability, Platform Engineering vacancy in San Francisco, CA
  • $232k - $319k

     ...too, let's talk. The Infrastructure Platform and Shared Services Team Okta authenticates...  ...scale the service with great people and reliable, cost-effective, and efficient...  ...Accelerate the velocity of SRE and product engineering by developing robust platforms, powerful... 
    Senior
    Permanent employment
    Local area
    Worldwide
    Flexible hours

    Okta, Inc.

    San Francisco, CA
    4 days ago
  •  ...you will find a home at Fieldguide. About the Role As a Senior Site Reliability Engineer (SRE) at Fieldguide, you will be responsible for ensuring...  ...with rapid growth. You’ll work closely with product and platform engineering teams to define and implement reliability standards... 
    Senior
    Remote work
    Work from home
    Flexible hours

    Fieldguide.ai

    San Francisco, CA
    5 days ago
  • OutSystems, Inc. is looking for a Site Reliability Engineer to join their team in San Francisco, CA. The ideal candidate will lead the onboarding of services and teams to reliability tenets while establishing SLOs and SLAs. Proficiency in Python and experience with Kubernetes... 
    Senior
    Flexible hours

    OutSystems, Inc.

    San Francisco, CA
    4 days ago
  • $50 per hour

     ...computing solutions. Crusoe Cloud is a managed cloud services platform powered by stranded energy that enables climate-friendly innovation...  ...to architecture and design (architecture, design patterns, reliability and scaling) of new and current systems Bachelor's Degree in... 
    Senior
    Temporary work
    Work experience placement

    Epoch Biodesign

    San Francisco, CA
    1 day ago
  • $325k

     ...Engineering at Ivo Engineers At Ivo Are Inventors. Ivo Was First-to-market With An...  ...build the foundation for Ivo’s entire platform. Customers are cagey about their...  ...hit our SLAs. We’re looking for a Senior or Staff level Site Reliability Engineer as part of the Infrastructure... 
    Senior
    Contract work

    Icehouseventures

    San Francisco, CA
    5 days ago
  • $166.9k - $225.9k

     ...operates as both a central engineering function and an embedded reliability practice. You'll be part...  ...engineering leads and staff engineers to define SLOs...  ...cause incidents. Central SRE Platform Work Beyond your product...  ...+ years of experience in Site Reliability Engineering,... 
    Senior
    Flexible hours

    Drata

    San Francisco, CA
    5 days ago
  • US Corp. is seeking a Lead Site Reliability Engineer to spearhead our mission of delivering highly available and performant systems. With an average of over 12 years of industry experience, the successful candidate will bridge the gap between software development and systems... 
    Senior

    Axiom Pursuits

    San Francisco, CA
    4 days ago
  • $227.2k - $324.5k

     ...About the Role: Site Reliability Engineering (SRE) at Tubi is not a traditional operations team....  ...latency, performance, and capacity of our platform, and we achieve our goals through a...  ...seeking an experienced and visionary Senior SRE Manager to lead and grow our... 
    Senior
    Full time
    Contract work
    Temporary work
    Local area
    Flexible hours

    Tubi

    San Francisco, CA
    3 days ago
  • $163k - $203k

    GoTo Meeting is looking for a Senior Site Reliability Engineer in San Francisco. You will be responsible for the reliability, scalability, and security of Prosper’s Cloud Platform portfolio. This role requires expertise in Kubernetes, cloud platforms (preferably GCP), and... 
    Senior

    GoTo Meeting

    San Francisco, CA
    4 days ago
  •  ...accessible, secure, and affordable. Join us in building a platform that empowers innovators everywhere to turn their visionary...  ...to redefine computing. About the Role We're seeking a Site Reliability Engineer to ensure Hyperbolic's GPU marketplace and AI infrastructure... 
    Senior

    deCircle

    San Francisco, CA
    3 days ago
  • $181k - $263k

    Senior Staff Site Reliability Engineer. Locations: San Francisco. Time type: Full time. Posted on: Posted...  ...LiveRamp is the data collaboration platform of choice for the world’s most...  ...for a Senior Staff Site Reliability Engineer who will set the technical direction... 
    Senior
    Work from home
    Flexible hours
    Night shift

    LiveRamp

    San Francisco, CA
    5 days ago
  •  ...home day is currently Tuesday. Engineering at Lambda is responsible for...  ...and operate observability platforms for logging, metrics, and...  ...adoptable and improve product reliability. Lead members of other engineering...  ...5+ years of experience in Site Reliability Engineering... 
    Senior
    Work at office
    Local area
    Work from home

    Lambda

    San Francisco, CA
    3 days ago
  • A leading biotechnology firm in South San Francisco is seeking a Site Reliability Engineer to architect and implement Infrastructure as Code (IaC) solutions that enhance cloud-based platform solutions for Machine Learning and HPC workloads. The ideal candidate has extensive... 
    Senior
    3 days per week

    Genentech

    South San Francisco, CA
    4 days ago
  • What you’ll do As a Senior Site Reliability Engineer, you’ll work closely with product teams in Spend to deliver and maintain scalable, reliable...  ...Software Engineering, or a related field. Expertise in cloud platforms (AWS/GCP), Kubernetes, observability, and incident... 
    Senior

    Airwallex

    San Francisco, CA
    3 days ago
  •  ...was a machine learning research engineer at Scale AI. The rest of our team...  ...teams, starting with an end to end platform powering warm outbound. Today,...  ...market with state-of-the-art AI. As a Senior SRE, you'll tackle the scaling and reliability challenges that come with adding... 
    Senior

    Unify

    San Francisco, CA
    4 days ago
  • An innovative R&D company in San Francisco is seeking a Site Reliability Engineer to join its Platform Engineering team. This position focuses on ensuring the reliability and performance of an AI-powered code review platform. The ideal candidate will have 6-8 years of... 
    Senior

    CodeRabbit

    San Francisco, CA
    2 days ago
  •  ...that possible. We’re a team of doctors, engineers, designers, researchers, and creatives...  ...end-to-end. Improve operational reliability: Identify recurring issues and reliability...  ...clusters, cloud infrastructure, and core platform services, with growing ownership as... 
    Senior
    Work at office
    Worldwide

    Heidi Health Ltd

    San Francisco, CA
    4 days ago
  •  ...and onboard services and teams to the reliability tenets. Establish and maintain...  ...equivalent. 6+ years of experience in Site Reliability Engineering, managing infrastructure and services...  ...Containerization technologies and orchestration platforms—mainly Kubernetes and EKS (CKA, CKAD... 
    Senior

    OutSystems, Inc.

    San Francisco, CA
    4 days ago
  • We are seeking a Sr. Site Reliability Engineer to join our team and run critical infrastructure for our blockchain and web applications. You’ll...  ...’ll ever need. It is a layer 1 blockchain and developer platform that connects any L1 and L2, from Ethereum to Bitcoin and... 
    Senior
    Remote job

    Blockchain Works

    San Francisco, CA
    6 days ago
  • $140k - $220k

    About the Job You’ll own reliability and operational excellence for Pylon's production systems...  ...ll build tooling that makes the entire engineering team more effective, establish on-call rotations and runbooks, and ensure our platform can handle the demands of a regulated,... 
    Senior

    Pylon

    San Francisco, CA
    2 days ago
  • $60 per hour

    Senior Site Reliability Engineer (Copy) Seattle Hybrid (Hybrid location). Full-time. About Us Supio is a trusted AI platform purpose-built for law firms, reshaping how data drives impactful outcomes. Our innovative approach blends technology with deep legal expertise,... 
    Senior
    Full time
    Work at office
    Flexible hours

    Bonfirevc

    San Francisco, CA
    4 days ago
  • For more information, please read our...  ...Senior Site Reliability Engineer. Locations: US - San Francisco Bay Area. time...  ...Containerization technologies and orchestration platforms, mainly Kubernetes and EKS (CKA, CKAD, CKS... 
    Senior
    Immediate start
    Remote work
    Worldwide

    OutSystems Inc.

    San Francisco, CA
    4 days ago
  • Senior Site Reliability Engineer. Hybrid - San Francisco. Our Mission & Values...  ...operates as both a central engineering function and an embedded reliability...  ...engineering leads and staff engineers to define SLOs...  ...incidents. Central SRE Platform Work Beyond your product team... 
    Senior
    Work at office
    Immediate start
    Worldwide
    Monday to Friday
    Flexible hours

    Careers at Drata

    San Francisco, CA
    5 days ago
  • $127k - $249k

    The Team Platform Engineering is the department within SRE that is responsible for a range of critical infrastructure and operational functions...  ..., alongside the critical components that ensure cluster reliability and security (e.g., CoreDNS, cert-manager, and Gatekeeper).... 
    Senior
    Work at office
    Local area
    Remote work
    Worldwide
    Flexible hours

    MongoDB

    San Francisco, CA
    1 day ago
  • A tech company focused on AI is seeking a Site Reliability Engineer to ensure the reliability and performance of its GPU marketplace. This role involves maintaining service level objectives, managing capacity, and implementing secure systems. The ideal candidate has strong... 
    Senior

    Hyperbolic Labs

    San Francisco, CA
    1 day ago
  • Drata is seeking a Senior Site Reliability Engineer in San Francisco. In this role, you will engage in reliability architecture for product teams, lead production readiness reviews, and build automation around monitoring and alerting. The ideal candidate has at least 6... 
    Senior

    Careers at Drata

    San Francisco, CA
    5 days ago
  • CloudDevs: Senior Site Reliability Engineer (SRE) CloudDevs works with fast-moving, venture-backed startups throughout the US. We’re building...  ...requirements. Requirements 5+ years in SRE, DevOps, or Platform Engineering roles. Strong experience with cloud infrastructure... 
    Senior

    The10minutecareersolution

    San Francisco, CA
    1 day ago
  • $220k - $235k

     ...the leading AI contracting platform that transforms agreements...  ...seeking a strategic, high‑output Staff/Senior Staff SRE to define the...  ...platform and champion engineering excellence across Ironclad....  ...strategic direction for the Site Reliability Engineering team and our broader... 
    Senior
    Full time
    Work at office

    Ironclad Inc.

    San Francisco, CA
    4 days ago
  • Senior Site Reliability Engineer - AI Infrastructure Location: Global Remote / San Francisco • Full-Time About Andromeda Andromeda Cluster was founded...  ...to deliver compute when and where it’s needed most. Our platform routes training and inference jobs across global supply,... 
    Senior
    Full time
    Remote work

    Cortes 23

    San Francisco, CA
    4 days ago
  • $15 per hour

    Summary The Wikimedia Foundation is looking for a Senior Site Reliability Engineer to support and develop the platform serving the world’s favorite encyclopedia,...  ...Foundation is a remote-first organization with staff members including contractors based in 40+ countries... 
    Senior
    Permanent employment
    For contractors
    Remote work

    Nerdleveltech

    San Francisco, CA
    3 days ago
