Senior / Staff Site Reliability, Platform Engineering

Saviynt

About Saviynt

Saviynt is a leader in identity security, delivering an AI-powered platform that governs and secures access to applications, data, and business processes for global enterprises and government institutions. Built for the AI era, Saviynt helps organizations move faster, securely and compliantly.

Why This Role Matters

Saviynt’s SaaS platform runs on complex, distributed, cloud-native systems. As a Staff Platform Engineer, you will play a critical role in ensuring these systems remain highly available, scalable, and secure as the company grows.

This is a hands-on engineering and technical leadership role. You will own reliability for major platform domains, design scalable solutions on Kubernetes and AWS, and drive automation and reliability improvements across multiple teams.

What You’ll Do

In this pivotal role, you will be instrumental in designing, building, and maintaining the shared infrastructure services and platforms that our product and application teams will depend on. You will focus on creating reusable, reliable, and scalable solutions that abstract away complexity, enabling other teams to focus on their core business logic and deliver features faster in a multi-cloud environment.

Design and build core platform components and shared infrastructure services that other development teams will integrate with and leverage to deploy and operate their applications.
Architect, implement, and manage highly available and scalable Kubernetes platforms as a service for internal consumers.
Develop robust, internal-facing tools and automation for infrastructure provisioning and management, primarily using Go (Golang).
Architect and optimize foundational solutions within cloud environments (AWS, Azure, etc.), focusing on creating reusable patterns and modules for other teams.
Design and implement shared event-driven architecture components and messaging platforms, using technologies like Kafka or Google Pub/Sub, that product teams can easily use.
Develop and maintain robust CI/CD pipelines (e.g., GitLab CI and ArgoCD) as a service, providing standardized and automated deployment workflows for various development teams.
Design and build resilient distributed systems components that serve as building blocks for other applications, focusing on reliability, fault tolerance, and performance.
Manage and optimize our shared infrastructure across multi-region cloud environments, ensuring that platform services are globally available and performant for all consumers.
Establish and enhance centralized observability and monitoring platforms and tools that provide self-service insights for consuming teams.
Define and implement clear, well-documented RESTful API designs for the infrastructure services you build, ensuring ease of integration for internal clients.
Implement and manage service mesh (e.g., Envoy, Istio) capabilities, providing traffic management, security, and policy enforcement as a shared platform for services.
Design, implement, and optimize highly available relational database services or shared data platforms for broad organizational use.
Collaborate closely with product development teams to understand their infrastructure needs and pain points, providing technical guidance and support.
Participate in on-call rotations to support the critical shared infrastructure you build.

What We’re Looking For

6+ years of experience in an Infrastructure Development, Platform Engineering, or Site Reliability Engineering role, with a strong focus on building tools and services for other engineers.
Deep expertise with Kubernetes in production environments, particularly in providing it as a platform (single-tenant and multi-tenant deployment architectures).
Strong programming skills in Go (Golang) and Python, with experience building robust, maintainable backend services and automation.
Extensive hands-on experience with at least one major cloud provider (AWS, GCP, or Azure); multi-cloud experience is a strong plus, especially in building abstractions over them.
Proven experience designing and implementing event-driven architecture and message queuing systems (e.g., Kafka, RabbitMQ, NATS) as shared services.
Solid understanding of and practical experience with CI/CD pipeline tools (especially GitLab CI), including establishing automated delivery processes for other teams.
Demonstrable experience designing and operating distributed systems, with an understanding of patterns for creating reliable, shared components.
Familiarity with multi-region cloud environments and strategies for building globally distributed and highly available platforms.
Proficiency in establishing and using comprehensive observability and monitoring platforms (e.g., Prometheus, Grafana, ELK stack, Datadog) for shared infrastructure.
Strong experience with RESTful API design principles and building well-documented, consumable APIs.
Knowledge of service mesh concepts and practical experience with solutions like Istio in a platform context.
Hands-on experience with relational databases (e.g., MySQL, PostgreSQL), ideally in managing them as a service.
Excellent communication skills and the ability to clearly articulate complex technical concepts to both technical and non-technical audiences.
A strong customer-centric mindset, treating internal development teams as your primary customers.
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical or military experience, is required.

Why Join Saviynt

Work on a large-scale, cloud-native SaaS platform.
Solve complex reliability challenges at scale.
Influence platform architecture and engineering practices.
Competitive compensation, benefits, and career growth.

Security & Compliance

This role requires adherence to Saviynt’s information security and privacy policies, including annual security training.

Vacancy posted 9 days ago

Location: Milpitas, CA