Senior Platform Reliability Engineer

$182k - $250k

Transformcap

Grow Therapy is on a mission to serve as the trusted partner for therapists growing their practice and patients accessing high-quality care. Powered by technology, we are a three-sided marketplace that empowers providers, augments insurance payors, and serves patients. Following the mass increase in depression and anxiety, the need for accessibility is more important than ever. To make our vision for mental healthcare a reality, we’re building a team of entrepreneurs and mission-driven go-getters. Since launching in February 2021, we’ve empowered more than ten thousand therapists and hundreds of thousands of clients across the country and insurance landscape. We’ve raised more than $328M in funding, including our Series D, at a $3B valuation from Sequoia Capital, Transformation Capital, TCV, SignalFire, Menlo Ventures, Goldman Sachs Alternatives, and others.

About the Role

We’re hiring a Senior Platform Reliability Engineer to help define and scale reliability as a first-class capability at Grow. In this role you’ll operate horizontally across the organization, shaping how reliability is understood, measured, and built into the developer experience. You’ll work closely with other members of the platform team as well as our product engineering teams to establish standards around observability, SLOs/SLAs, and incident response, while also helping translate those standards into self-service tooling and “golden paths” that make it easy for teams to adopt them. This is a high-impact, highly autonomous role where you’ll drive both cultural and technical change, ultimately enabling teams to independently build and operate reliable systems at scale.

What You'll Work On

You’ll help us establish and scale reliability as a discipline at Grow by:

Defining Reliability Standards: Establishing frameworks for SLOs/SLAs, error budgets, and operational readiness; helping teams understand what to measure and why it matters.
Improving Observability & Measurement: Identifying gaps in metrics, logging, and tracing; ensuring services are measurable, debuggable, and aligned with reliability goals.
Evolving Incident Response: Developing and improving incident response practices, from detection to post-incident learning, and helping teams build sustainable on-call and escalation patterns.
Enabling Self-Service Reliability: Partnering with the platform team to build tooling and abstractions (e.g., service scorecards, dashboards, templates, golden paths) that make it easy for teams to adopt and stay compliant with reliability standards.
Driving Adoption Across Teams: Working cross-functionally to educate, influence, and guide engineering teams, scaling reliability practices through a combination of clear standards, strong communication, and developer-friendly systems.

Who You Are

Experienced in production systems: You have 6+ years of experience operating and improving reliability of production systems at scale.
Strong foundation in cloud and infrastructure: You have hands‑on experience with AWS, Kubernetes (e.g., EKS), and infrastructure as code tools like Terraform.
Deep understanding of reliability principles: You’ve defined or worked with SLOs/SLAs, understand error budgets, and have experience improving reliability through measurement and iteration.
Observability expertise: You’ve worked with modern observability tooling (we use DataDog) and understand how to build actionable monitoring systems across metrics, logs, and traces.
Systems thinker: You’re able to zoom out, identify patterns across teams and services, and design solutions that scale beyond a single system.
Impact-oriented: You focus on outcomes over output and care deeply about improving real reliability outcomes, not just adding processes.
Strong communicator and influencer: You can drive change across teams without direct authority, balancing pragmatism with long-term vision.
Self-directed: You thrive in ambiguous environments and are comfortable defining problems, proposing solutions, and executing independently.
Team player: You collaborate well, communicate with empathy, and enjoy mentoring and learning from others.

Bonus Points

You’ve helped introduce or scale reliability practices in a growing organization.
You’ve built internal tooling or platforms used by multiple teams.
You have experience designing service-level scorecards or compliance/reporting systems.
You’ve worked with both SaaS (e.g., DataDog) and self‑managed observability stacks.
You were previously a product engineer and bring empathy for developer experience.
You have experience with database reliability and performance (we use PostgreSQL).

Why This Role Is Exciting

This is a rare opportunity to define what reliability looks like at a growing, scaling engineering organization, and to do it in a way that actually sticks. You won’t just be responding to incidents or working within a single team. You’ll be shaping how reliability is measured, enforced, and experienced across the entire company. You’ll work alongside your teammates to turn best practices into intuitive, self-service systems that engineers rely on every day. Your work will directly improve system reliability, reduce incidents, and enable teams to move faster with confidence, ultimately making reliability a built-in property of how we build software at Grow.

Role Details

Employment Type: Full Time, Exempt
Base Compensation: The base compensation range for this position is $182,000–$250,000 USD annually. The base compensation for this role will vary depending on several factors, including relevant experience, qualifications, and the candidate’s working location.
Location: This is a hybrid role with the expectation to work onsite from our San Francisco, NYC, or Seattle hub location three days per week (Tuesday, Wednesday, and Thursday) and travel 2–3 times per year (e.g., company and department offsites).

Full Time Employee Benefits

Comprehensive Health Coverage: Medical, dental, and vision insurance, plus life and disability coverage.
Parental Leave & Family Support: Up to 18 weeks paid leave and a new child stipend.
Financial Wellness: 401(k) program and equity opportunities.
Meals & Home Office Support: Stipends for home office setup and ongoing funds for meals, with tailored perks for both remote and in-office employees.
Time Off to Recharge: Flexible PTO, 12 paid holidays, and a full winter break week.
Wellness & Development: Annual stipends to put towards personal & professional growth.
Mental & Physical Health Support: No-cost access to therapy through the Grow platform, weekly flexible hours for self-care (“Mental Health Mornings/Afternoons”), and memberships to leading wellness apps (such as One Medical, Headspace, and Talkspace).
Extra Perks: Pet insurance discounts, commuter benefits, and global travel assistance.

Research shows that some groups hesitate to apply unless they meet every qualification. If you’re excited about this role but don’t check every box, we encourage you to apply. At Grow, we value diverse experiences, transferable skills, and the unique strengths each person brings.

Grow Therapy is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.

By submitting your application, you acknowledge and consent to the use of automated tools as part of our recruitment process. Specifically, we use a third‑party AI tool, Gem, to assist in the initial screening of resumes. Importantly, no hiring decisions are made by the AI tool.
All decisions about which candidates move forward are made by our human recruiting team after independent review. We are committed to transparency and fairness in our hiring practices. If you have questions about how our AI tools work, or would like more information about how your application will be processed, please contact us. If you require an accommodation due to a disability, or have concerns about the use of AI in the hiring process, please also contact us. We are happy to provide assistance or offer an alternative method of participating in the recruitment process.


Vacancy posted 18 hours ago

Similar jobs that could be interesting for you
Based on the Senior Platform Reliability Engineer in San Francisco, CA vacancy

Senior SRE & Platform Engineer for AI-Driven Ops
$163k - $203k
GoTo Meeting is looking for a Senior Site Reliability Engineer in San Francisco. You will be responsible for the reliability, scalability, and security of Prosper’s Cloud Platform portfolio. This role requires expertise in Kubernetes, cloud platforms (preferably GCP), and...
Senior
GoTo Meeting
San Francisco, CA
5 days ago
Senior SRE Platform Engineer for AI-Powered Code Review
An innovative R&D company in San Francisco is seeking a Site Reliability Engineer to join its Platform Engineering team. This position focuses on ensuring the reliability and performance of an AI-powered code review platform. The ideal candidate will have 6-8 years of experience...
Senior
CodeRabbit
San Francisco, CA
3 days ago
Senior GPU HPC Platform Reliability Engineer
A leading AI research company in San Francisco is seeking a software engineer for its Fleet High Performance Computing team. In this role, you'll ensure the reliability and uptime of the compute fleet, working with automation systems and monitoring tools. Ideal candidates...
Senior
OpenAI
San Francisco, CA
1 day ago
Senior Platform Reliability Engineer
$200k - $250k
A leading visual creation platform in San Francisco is seeking a Senior Owner of Stability and Infrastructure. This hands-on technical leadership role demands expertise in service reliability to ensure the platform's performance as it scales. Responsibilities include setting...
Senior
Vizcom
San Francisco, CA
5 days ago
Senior Platform & Reliability Engineer — AI-Native Scale
OpenArt AI in San Francisco is seeking a Senior Platform & Reliability Engineer to design and improve the reliability of its infrastructure. The role emphasizes building and operating production systems while collaborating with product engineers to ensure platform scalability...
Senior
OpenArt AI
San Francisco, CA
4 days ago
Senior Platform & Reliability Engineer (SRE)
$200k - $250k
...unsolicited. About Vizcom Vizcom is a visual creation platform that combines modern web tooling with AI-... ...production infrastructure. We’re hiring a senior owner of stability and infrastructure to ensure the platform is reliable, fast, and resilient as we scale. Role...
Senior
Permanent employment
Vizcom
San Francisco, CA
5 days ago
Senior Platform & Reliability Engineer
Overview Senior Platform & Reliability Engineer OpenArt is an AI Storytelling and Visual Creation Platform used by millions worldwide. We’re building the next generation of creative tools powered by cutting-edge AI, enabling anyone to create videos, visuals, characters...
Senior
Remote work
Worldwide
Visa sponsorship
OpenArt AI
San Francisco, CA
4 days ago
Senior / Staff Site Reliability, Platform Engineering
...identity security, delivering an AI-powered platform that governs and secures access to... ...cloud‑native systems. As a Staff Platform Engineer, you will play a critical role in ensuring... ...technical leadership role. You will own reliability for major platform domains, design...
Senior
Saviynt
San Francisco, CA
10 days ago
Senior Site Reliability Engineer, Platform Infrastructure (Foundations)
...raised to date. About the role Anyscale is looking for a Senior Site Reliability Engineer to join the Infrastructure team. Anyscale aims to provide... ...the critical infrastructure that powers Anyscale’s cloud platform. You will have the opportunity to work on open-source...
Senior
Anyscale
San Francisco, CA
4 days ago
Senior Manager, Site Reliability Engineering - Infrastructure Platform
$232k - $319k
...too, let's talk. The Infrastructure Platform and Shared Services Team Okta authenticates... ...scale the service with great people and reliable, cost-effective, and efficient... ...Accelerate the velocity of SRE and product engineering by developing robust platforms, powerful...
Senior
Permanent employment
Local area
Worldwide
Flexible hours
Okta, Inc.
San Francisco, CA
5 days ago
Sr. Director, SRE Platform Engineering
$202.8k - $327.63k
...Intelligent Agreement Management platform, companies can create, commit, and... ...management (CLM). What you’ll do The Senior Director, SRE Platform Engineering is a senior engineering leader... ...Service Management (ITSM) and Site Reliability Engineering (SRE) capabilities, applying...
Senior
Permanent employment
Contract work
Work at office
Local area
Remote work
2 days per week
DocuSign, Inc.
San Francisco, CA
4 days ago
Senior Offshore Mechanical Reliability Engineer
Hudson Manpower is seeking a Mechanical Engineer - Offshore Reliability for a role involving the improvement of offshore mechanical equipment reliability and performance. This position requires a Bachelor's Degree in Mechanical Engineering and a minimum of 12 years of experience...
Senior
Hudson Manpower
San Francisco, CA
5 days ago
Senior Reliability & DFX Engineer for AI Accelerators
A leading AI research organization in San Francisco is seeking a cross-stack engineer to ensure reliability in next-generation AI systems. This hands-on position requires extensive experience in reliability modeling and DFX architecture to enhance the durability and performance...
Senior
OpenAI
San Francisco, CA
4 days ago
Senior Nix-Based CI/CD Reliability Engineer
Revel is seeking a Senior Software Reliability Engineer in San Francisco to enhance their deployment tooling for zero-downtime releases. You will design and maintain CI pipelines, focus on Nix-based systems, and support high-consequence software delivery. Applicants should...
Senior
Revel
San Francisco, CA
3 days ago
Senior DB Reliability Engineer (Remote)
scribehow.com is seeking a Senior Database Reliability Engineer based in San Francisco (hybrid model). You will own the reliability, performance, and scalability of our data tier and work with a growing engineering team. Your expertise will ensure smooth operations across...
Senior
Remote job
scribehow.com
San Francisco, CA
3 days ago
Senior Site Reliability Engineer
...universally accessible, secure, and affordable. Join us in building a platform that empowers innovators everywhere to turn their visionary... ...computing. About the Role We're seeking a Site Reliability Engineer to ensure Hyperbolic's GPU marketplace and AI infrastructure...
Senior
Hyperbolic Labs
San Francisco, CA
4 days ago
Senior Reliability Engineer - Rotating Equipment
$160k - $190k
Southern Recruiting Solutions, Inc. seeks a Sr. Reliability Engineer based in San Francisco, California. This role requires a Bachelor's in Mechanical Engineering and over 8 years of experience in a chemical plant or refinery. The successful candidate will conduct root...
Senior
Southern Recruiting Solutions, Inc.
San Francisco, CA
5 days ago
Senior Site Reliability Engineer
...About the Role We're looking for an experienced Site Reliability Engineer (SRE) to help us scale our platform with reliability, observability, and operational excellence at the core. You'll partner with engineers and data scientists to build, automate, and maintain...
Senior
Alembic Limited
San Francisco, CA
3 days ago
Senior Site Reliability Engineer - ML/HPC Cloud Platforms
A leading biotechnology firm in South San Francisco is seeking a Site Reliability Engineer to architect and implement Infrastructure as Code (IaC) solutions that enhance cloud-based platform solutions for Machine Learning and HPC workloads. The ideal candidate has extensive...
Senior
3 days per week
Genentech
South San Francisco, CA
5 days ago
Senior Site Reliability Engineer
...landscape. The Role You'll be the infrastructure and reliability engineer on the Data Replication team - a full-stack product team running... ...Own the infrastructure underpinning the Data Replication platform - Kubernetes clusters, CI/CD pipelines, secrets management,...
Senior
Local area
Airbyte
San Francisco, CA
5 days ago
Senior Principal Cloud Infra Reliability Engineer
$261k - $326k
A technology company specializing in AI infrastructure is seeking a Principal Engineer to enhance reliability and scalability of cloud systems. This role demands over 15 years of experience in production engineering or related fields and involves setting technical directions...
Senior
Crusoe
San Francisco, CA
5 days ago
Senior Site Reliability Engineer
...About the job Senior Site Reliability Engineer About the Company Stellar is a decentralized, public blockchain that gives developers the tools to create experiences that are more like cash than crypto. The network is faster, cheaper, and far more energy-efficient...
Senior
TechChain Talent
San Francisco, CA
5 days ago
Senior Site Reliability Engineer
$160k - $250k
...machine learning models, we also need to grow our DevOps and Site Reliability team to maintain the reliability of our enterprise SaaS... ...secure infrastructure Manage a diverse array of technology platforms, following best practices and procedures Participate in on-...
Senior
Hive
San Francisco, CA
4 days ago
Senior Site Reliability Engineer
...algorithms that significantly outperforms individual engineers. We combine language models with human ingenuity to push... ...The Role: We are seeking an experienced Site Reliability Engineer to join our Platform Engineering team in the Bay Area. You'll be instrumental...
Senior
CodeRabbit
San Francisco, CA
1 day ago
Senior SRE Engineer: Scale & Reliability (Kubernetes/GCP)
A leading language learning platform is seeking an experienced SRE Engineer to ensure the reliability and resilience of their infrastructure. Responsibilities include leading incident response, improving observability, and collaborating with various teams to enhance platform...
Senior
Speak
San Francisco, CA
3 days ago
Senior Site Reliability Engineer
$181.69k - $213.75k
...Senior Site Reliability Engineer San Francisco, California; Santa Clara, California; Seattle, WA The Company You'll Join Carta connects founders... ...Trusted by 65,000+ companies in 160+ countries, Carta's platform of software and services lays the groundwork so you can...
Senior
Full time
Work at office
Carta
San Francisco, CA
4 days ago
Senior Site Reliability Engineer
US Corp. is seeking a Lead Site Reliability Engineer to spearhead our mission of delivering highly available and performant systems. With an average of over 12 years of industry experience, the successful candidate will bridge the gap between software development and systems...
Senior
Axiom Pursuits
San Francisco, CA
5 days ago
Senior Site Reliability Engineer - AI-Driven, Scalable Infra
OutSystems, Inc. is looking for a Site Reliability Engineer to join their team in San Francisco, CA. The ideal candidate will lead the onboarding of services and teams to reliability tenets while establishing SLOs and SLAs. Proficiency in Python and experience with Kubernetes...
Senior
Flexible hours
OutSystems, Inc.
San Francisco, CA
5 days ago
Senior Site Reliability Engineer
$195k - $240k
...Senior Site Reliability Engineer San Francisco (Hybrid) At You.com, we are building the AI Search Infrastructure that powers modern AI systems... ...-time, accurate, and citation-backed information. Our platform combines proprietary vertical indexes with LLM-optimized...
Senior
Full time
Immediate start
Remote work
Work from home
Flexible hours
Y.O.U.
San Francisco, CA
4 days ago
Senior Software Engineer - Site Reliability Engineering
...Udaip Cloud-Based Data And Ai Platform Engineer At U.S. Bank, we're on a journey to do our best. Helping the customers and businesses we serve to make better and smarter financial decisions and enabling the communities we support to grow and succeed. We believe it takes...
Senior
Temporary work
Work experience placement
Phenom People
San Francisco, CA
4 days ago
