Staff Software Engineer - Reliability

Rubrik, Inc.

Overview

The Site Reliability Engineering (SRE) team at Rubrik ensures the reliability, availability, performance, and security of our enterprise infrastructure services, spanning global SaaS platforms and government-compliant environments. We operate at the intersection of software development and systems engineering, prioritizing automation, self-healing architectures, and resiliency. As a Staff Site Reliability Engineer, you will serve as a primary technical leader and architect across our distributed cloud systems, drive long-term technical roadmaps, establish cross-organizational reliability standards, and solve complex distributed systems challenges that safeguard enterprise and public sector environments.

Beyond the core SRE charter, this Staff role also leads the Application-SRE team, a US-based group that partners with Engineering, Sales, and Support to unblock POCs, drive complex customer escalations to resolution, and convert recurring field signals into reliability roadmap items. You will be the technical leader and project owner for Application-SRE, setting direction, tracking commitments, and ensuring the team operates as a high-leverage bridge between the field and the broader engineering organization.

What You'll Do

As a Staff Site Reliability Engineer, you will possess engineering-wide influence and own the following critical areas:
  • Infrastructure Strategy & Architecture: Formulate and execute the architectural vision for Rubrik's Cloud Platform, optimizing backend infrastructure systems like Kubernetes, MySQL, and cloud-native services for performance, security, and multi-region scale.
  • Hyperscale Automation & Platform Tooling: Build, scale, and maintain sophisticated internal tools, platform controllers, and automation frameworks in Go or Python to systematically reduce operational toil.
  • Cross-Functional Leadership: Wield engineering-wide influence to create technical consensus among component, platform, and security teams, embedding resilience, capacity guards, and compliance from the initial designs.
  • Reliability Governance: Define, audit, and enforce SLIs, SLOs, and error budgets across critical services, translating telemetry insights into actionable roadmaps during executive reviews.
  • Incident Command & Operations Review: Serve as a primary Incident Commander for high-severity outages, directing mitigation and conducting blameless post-mortems to drive durable fixes.
  • Cost Governance & Capacity Modeling: Architect cost-observability tools and attribution frameworks, leading capacity forecasting, quota optimization, and vendor SLA management.
  • Application-SRE Leadership: Set technical direction for the Application-SRE team, elevating how the team diagnoses, mitigates, and resolves complex customer-impacting issues.
  • Technical Multiplier & Mentorship: Champion SRE best practices, mentor engineers across the organization, participate in interview processes, and raise the technical bar.
  • On-Call Rotations: Participate in on-call rotations.

Experience You'll Need

  • Citizenship & Residency: Must be a US Citizen residing in CONUS.
  • Education: BS, MS, or PhD in Computer Science, Computer Engineering, or related technical discipline.
  • Industry Experience: 8–12+ years of software engineering and production cloud infrastructure, with 5+ years in an SRE, DevOps, or Platform role for hyperscale SaaS.
  • Technical Depth: Proficiency in Go, Python, or Java with strong knowledge of concurrency, data structures, and testing patterns.
  • Distributed Systems Expertise: Experience designing and operating large-scale distributed systems, databases, and high-availability cloud architectures.
  • Systems Internals: Strong command of Unix/Linux, systems administration, and networking.
  • Field-to-Product Feedback: Ability to translate customer escalations and POCs into product and reliability improvements.
  • Customer & Field Fluency: Experience partnering with Sales, Support, and customers on escalations and POCs.
  • Leadership Capability: Track record of technical leadership, mapping architectural dependencies, and guiding multi-team initiatives.

Preferred Qualifications

  • Extensive production experience with Kubernetes deployments (GKE, EKS) and large-scale databases.
  • Experience with compliance frameworks (e.g., FedRAMP, SOC 2, ISO 27001, CJIS).
  • Fluency in Infrastructure-as-Code (Terraform, Pulumi) and observability tooling (Prometheus, Grafana, OpenTelemetry).

Equal Opportunity

Rubrik is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, or disability. We provide equal employment opportunities to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability or genetics. In addition to applicable federal law, Rubrik complies with state and local nondiscrimination laws. Reasonable accommodations are available to qualified applicants with disabilities. For accommodations, contact (email address available on click.appcast.io).

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Staff Software Engineer - Reliability in Palo Alto, CA vacancy
  • $218.3k - $327.5k

    ABOUT TEAM & ABOUT ROLE The Site Reliability Engineering (SRE) team at Rubrik ensures the absolute...  ...environments. We operate at the intersection of software development and systems engineering,...  ..., and structural resiliency. As a Staff Site Reliability Engineer, you will... 
    Suggested
    Local area
    Shift work

    Rubrik

    Palo Alto, CA
    1 day ago
  • $160.2k - $290.7k

     ...platform team develops the first layers of software on the GM Autonomous Vehicles from...  ...vehicle platforms. Role As a Staff Software Engineer, you are the expert professionals...  ...scalable, flexible, cost-efficient, and reliable foundation is critical. This role will... 
    Suggested
    Work experience placement
    Local area
    Remote work
    Work from home
    Relocation package
    Flexible hours

    General Motors

    Mountain View, CA
    2 days ago
  • $207k - $300k

    Staff Software Engineer, Site Reliability Engineering, Traffic Virtnet. Google, Sunnyvale, CA, USA. Bachelor’s degree in Computer Science, a related field, or equivalent practical experience. 8 years of experience with software development in one or... 
    Suggested
    Full time

    Google Inc.

    Sunnyvale, CA
    3 days ago
  • $207k - $300k

    Staff Software Engineer, Site Reliability Engineering. Google, Sunnyvale, CA, USA. Bachelor’s degree in Computer Science, a related field, or equivalent practical experience. 8 years of experience with software development in one or more programming... 
    Suggested
    Full time

    Google Inc.

    Sunnyvale, CA
    4 days ago
  • $170k - $216k

     ...S. states. The Planner/Perception Reliability team builds out architectures, tools,...  ...reliability and is accountable for onboard software health while ensuring high...  ...this hybrid role you will report to a Staff Software Engineer / Tech Lead Manager. You will:... 
    Suggested
    Full time
    Immediate start
    Remote work

    Waymo

    Mountain View, CA
    5 days ago
  • $180k - $250k

     ...encompassing solution that integrates advanced software and hardware powering the fleet,...  ...the role We are seeking a Senior/Staff Software Engineer to join our Localization team to build...  ...and other sensors to create a robust, reliable localization pipeline. Develop and... 
    Odd job
    Work at office

    Gatik AI

    Mountain View, CA
    4 days ago
  • $218.8k - $335.3k

     ...systems to intuitive design, intelligent software, and next-generation safety and...  ...validation, and align multiple teams to ship reliable, scalable autonomy capabilities that meet...  ...Lead technical reviews and drive software engineering best practices across the team and the larger... 
    Work experience placement
    Local area
    Remote work
    Work from home
    Relocation package
    Flexible hours

    General Motors

    Mountain View, CA
    1 day ago
  • $110k - $230k

     ...Duck Creek Software Engineer At GEICO, we offer a rewarding career where your ambitions are met...  ...emphasis on improving its scalability, reliability, and efficiency. Responsibilities In this role, you will serve as a Staff Engineer, driving secure, performant, and... 
    Hourly pay
    Work experience placement
    Flexible hours

    GEICO

    Palo Alto, CA
    5 days ago
  • $132.8k - $250.8k

     ...Staff Software Engineer We made history and now we work to transform the future – for our customers, our communities and our families. You...  ...build web applications and services that result in useful, reliable, secure products for our customers. This team creates back-... 
    Full time
    Immediate start
    Flexible hours

    Ford Motor Company

    Palo Alto, CA
    4 days ago
  • $206.5k - $258.1k

     ...protect it for future generations. Role Summary As a Staff HIL Software Engineer, you will directly architect and own the verification strategy and infrastructure, driving the quality and reliability of our mission-critical automotive software solutions. You... 
    Full time
    Contract work
    Temporary work
    Part time
    Local area
    Shift work

    Rivian

    Palo Alto, CA
    5 days ago
  • $185k - $260k

     ...generation, blood-based tests that are reliable, accessible and deliver a new...  ...major systems with a focus on software and cloud architecture. You'll partner across Engineering, Data Science, R&D, Regulatory,...  ...components. Partner with staff engineers on test design and... 
    Flexible hours
    3 days per week

    DELFI Diagnostics

    Palo Alto, CA
    3 days ago
  • $220k - $303.6k

     ...Google, including YouTube's monetization engine and key search advertising technologies,...  ...enterprise platform that remains fast, reliable and secure at scale. Build and maintain...  ...the Role is Right For Me? ~7+ years software development experience using one or more... 
    Temporary work
    Work at office
    Flexible hours

    Moloco

    Menlo Park, CA
    2 days ago
  • $200k - $250k

     ...Staff Software Engineer, Embedded Mountain View, CA Kodiak Robotics, Inc. was founded in 2018 and has become a leader in autonomous ground...  ...devices that interface with our truck systems, ensure reliable operations of large robots, and provide cutting edge functionality... 
    Temporary work
    Work at office
    Visa sponsorship
    Flexible hours

    Kodiak

    Mountain View, CA
    3 days ago
  • $144.25k - $256.25k

     ...Staff Software Engineer New York, NY, United States Phoenix, AZ, United States Palo Alto, CA, United States (Hybrid) Job Description...  ...optimizes development processes to ensure high-quality, reliable, and efficient software systems. Responsibilities Lead... 
    Full time
    Work at office
    Local area
    Visa sponsorship
    Flexible hours
    Shift work

    American Express

    Palo Alto, CA
    7 days ago
  • $228.6k - $314.25k

     ...recommendations that help customers run workloads reliably at scale. Beyond query observability,...  ...across all these surfaces, raising the engineering bar of the combined team, and shaping...  .... Champion reliable, high-quality software and the operational practices that let a... 
    Worldwide

    Databricks

    Mountain View, CA
    1 day ago
  • $200k - $250k

     ...Staff Software Engineer, Controls Kodiak Robotics, Inc. was founded in 2018 and has become a leader in autonomous ground transportation committed...  ...& novel techniques Architect, develop, and test reliable, redundant, and safety-critical software that controls fully... 
    Temporary work
    Work at office
    Visa sponsorship
    Flexible hours

    Kodiak

    Mountain View, CA
    3 days ago
  • $192k - $260k

     ...help address several major challenges, including data staleness, reliability, total cost of ownership, data lock-in, and limited use-case...  ...this vision is the next generation (decoupled) query engine and structured storage system that can outperform specialized... 
    Local area
    Worldwide

    Databricks

    Mountain View, CA
    2 days ago
  • $209.7k - $283.8k

     ...To support this growth, we need strong technical ownership to ensure our ML pipelines remain reliable, scalable, and architecturally sound. We are seeking a staff ML engineer to design and evolve the large-scale offline platform. This role focuses on building... 
    Work at office
    Worldwide
    Relocation package

    Unity

    Mountain View, CA
    2 days ago
  • $238k - $302k

     ...Staff Software Engineer, Multiverse Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver...  ...-of-memory", that would affect the Waymo Driver's system reliability in the real world. You Will: Oversee and... 
    Full time
    Remote work

    Waymo

    Mountain View, CA
    1 day ago
  • $161.71k - $234.33k

     ...Staff Software Engineer, Test Mountain View, CA We are CARIAD, an automotive software development team with the Volkswagen Group. Our...  ...validation results, and improve software quality, performance, and reliability within assigned systems and programs. The Staff... 
    Permanent employment
    Temporary work
    Early shift

    CARIAD, Inc.

    Mountain View, CA
    6 days ago
  • $189k - $303k

     ...accessible for all. We are searching for an exceptional Staff-level Backend Software Engineer to join the Aurora Services Engineering team and take...  ...infrastructure to scale our products with high availability and reliability. Collaborate with stakeholders including Security,... 
    Work at office
    Local area
    Remote work
    3 days per week

    Aurora Innovation

    Mountain View, CA
    5 days ago
  • $80 - $100 per hour

     ...Fortune 500 brands in the world. Job Title: Senior Backend Engineer/ Simulation Software Engineer. Duration: 6+ months Location: Sunnyvale,...  ...quality through testing, performance profiling, reliability patterns, observability, secure coding, and maintainability... 

    LeadStack Inc.

    Mountain View, CA
    2 days ago
  •  ...Autoscience Institute Senior Or Staff Software Engineer (Technical Lead) At Autoscience Institute, we automate AI research. Recently, we announced...  ...research agents. You'll be responsible for building reliable backend infrastructure and production systems that support... 
    Full time

    Autoscience Institute

    Menlo Park, CA
    5 days ago
  • $190k - $261.25k

     ...business impact. Founded by engineers and driven by customer...  ...only getting started. As the Staff Technical Lead (TL) for Customer...  ...quality, safety, and reliability standards Design agentic workflows...  ...engineers and SMEs across Software and Support Engineering functions... 
    Local area
    Worldwide

    Databricks

    Mountain View, CA
    4 days ago
  • $281k - $356k

     ...Senior Staff Software Engineer Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver....  ...champion best practices in data engineering, including system reliability, efficiency, developer experience, and innovation to... 
    Full time
    Remote work

    Waymo

    Mountain View, CA
    1 day ago
  • $150k - $226k

     ...Harness is the AI Software Delivery Platform company, led by technologist and entrepreneur...  ..., deployments, application security, reliability, compliance, and cost optimization....  ..., and real-time risk detection to help engineering teams ensure software integrity, prevent... 
    Local area
    Immediate start
    Flexible hours
    Shift work

    Harness

    Mountain View, CA
    4 days ago
  • $281k - $356k

     ...Senior Staff Software Engineer, TLM Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver...  ...the multi-year roadmap for signal quality, coverage, and reliability. Your mandate is to ensure our signals don't just match... 
    Full time
    Remote work

    Waymo

    Mountain View, CA
    2 days ago
  • $112.68k - $212.76k

     ...Staff Software Engineer Ford's IVI (In-Vehicle Infotainment) organization is seeking a Staff Software Engineer with deep technical expertise...  ...and optimize software for embedded platforms, ensuring reliability and real-time performance in compute-constrained ECU environment... 
    Immediate start
    Flexible hours

    Ford Motor Company

    Palo Alto, CA
    1 day ago
  • $152k - $248k

     ...tier scalable, high-volume performing, and reliable user-centric applications that operate 24x7. You will produce high quality software that is unit tested, code reviewed, and...  ...leadership, driving and performing best engineering practices to initiate, plan, and execute... 
    For contractors
    Work experience placement
    Work at office
    Flexible hours

    LinkedIn

    Mountain View, CA
    3 days ago
  •  ...The team lives at the intersection of platform engineering discipline and developer-facing product thinking. As a Staff Engineer on the Developer Experience team,...  ...build on — with a strong bias toward generality, reliability, and clean abstractions Drive measurable... 
    Work at office
    Remote work
    Flexible hours

    ServiceNow

    Mountain View, CA
    14 hours ago
