Senior Platform Reliability Engineer
$182k - $250k
Grow Therapy
Grow Therapy is on a mission to serve as the trusted partner for therapists growing their practice and patients accessing high-quality care. Powered by technology, we are a three-sided marketplace that empowers providers, augments insurance payors, and serves patients. Following the mass increase in depression and anxiety, the need for accessibility is more important than ever. To make our vision for mental healthcare a reality, we’re building a team of entrepreneurs and mission-driven go-getters. Since launching in February 2021, we’ve empowered more than ten thousand therapists and hundreds of thousands of clients across the country and insurance landscape. We’ve raised more than $328M in funding, including our Series D, at a $3B valuation from Sequoia Capital, Transformation Capital, TCV, SignalFire, Menlo Ventures, Goldman Sachs Alternatives, and others.

About the Role

We’re hiring a Senior Platform Reliability Engineer to help define and scale reliability as a first-class capability at Grow. In this role you’ll operate horizontally across the organization, shaping how reliability is understood, measured, and built into the developer experience. You’ll work closely with other members of the platform team as well as our product engineering teams to establish standards around observability, SLOs/SLAs, and incident response, while also helping translate those standards into self-service tooling and “golden paths” that make it easy for teams to adopt them. This is a high-impact, highly autonomous role where you’ll drive both cultural and technical change, ultimately enabling teams to independently build and operate reliable systems at scale.

What You'll Work On

You’ll help us establish and scale reliability as a discipline at Grow by:

- Defining Reliability Standards: Establishing frameworks for SLOs/SLAs, error budgets, and operational readiness; helping teams understand what to measure and why it matters.
- Improving Observability & Measurement: Identifying gaps in metrics, logging, and tracing; ensuring services are measurable, debuggable, and aligned with reliability goals.
- Evolving Incident Response: Developing and improving incident response practices, from detection to post-incident learning, and helping teams build sustainable on-call and escalation patterns.
- Enabling Self-Service Reliability: Partnering with the platform team to build tooling and abstractions (e.g., service scorecards, dashboards, templates, golden paths) that make it easy for teams to adopt and stay compliant with reliability standards.
- Driving Adoption Across Teams: Working cross-functionally to educate, influence, and guide engineering teams, scaling reliability practices through a combination of clear standards, strong communication, and developer-friendly systems.

Who You Are

- Experienced in production systems: You have 6+ years of experience operating and improving reliability of production systems at scale.
- Strong foundation in cloud and infrastructure: You have hands-on experience with AWS, Kubernetes (e.g., EKS), and infrastructure as code tools like Terraform.
- Deep understanding of reliability principles: You’ve defined or worked with SLOs/SLAs, understand error budgets, and have experience improving reliability through measurement and iteration.
- Observability expertise: You’ve worked with modern observability tooling (we use Datadog) and understand how to build actionable monitoring systems across metrics, logs, and traces.
- Systems thinker: You’re able to zoom out, identify patterns across teams and services, and design solutions that scale beyond a single system.
- Impact-oriented: You focus on outcomes over output and care deeply about improving real reliability outcomes, not just adding processes.
- Strong communicator and influencer: You can drive change across teams without direct authority, balancing pragmatism with long-term vision.
- Self-directed: You thrive in ambiguous environments and are comfortable defining problems, proposing solutions, and executing independently.
- Team player: You collaborate well, communicate with empathy, and enjoy mentoring and learning from others.

Bonus Points

- You’ve helped introduce or scale reliability practices in a growing organization.
- You’ve built internal tooling or platforms used by multiple teams.
- You have experience designing service-level scorecards or compliance/reporting systems.
- You’ve worked with both SaaS (e.g., Datadog) and self-managed observability stacks.
- You were previously a product engineer and bring empathy for developer experience.
- You have experience with database reliability and performance (we use PostgreSQL).

Why This Role Is Exciting

This is a rare opportunity to define what reliability looks like at a growing, scaling engineering organization, and to do it in a way that actually sticks. You won’t just be responding to incidents or working within a single team. You’ll be shaping how reliability is measured, enforced, and experienced across the entire company. You’ll work alongside your teammates to turn best practices into intuitive, self-service systems that engineers rely on every day. Your work will directly improve system reliability, reduce incidents, and enable teams to move faster with confidence, ultimately making reliability a built-in property of how we build software at Grow.

Role Details

- Employment Type: Full Time, Exempt
- Base Compensation: The base compensation range for this position is $182,000–$250,000 USD annually. The base compensation for this role will vary depending on several factors, including relevant experience, qualifications, and the candidate’s working location.
- Location: This is a hybrid role with the expectation to work onsite from our San Francisco, NYC, or Seattle hub location three days per week (Tuesday, Wednesday, and Thursday) and travel 2–3 times per year (e.g., company and department offsites).

Full Time Employee Benefits

- Comprehensive Health Coverage: Medical, dental, and vision insurance, plus life and disability coverage.
- Parental Leave & Family Support: Up to 18 weeks paid leave and a new child stipend.
- Financial Wellness: 401(k) program and equity opportunities.
- Meals & Home Office Support: Stipends for home office setup and ongoing funds for meals, with tailored perks for both remote and in-office employees.
- Time Off to Recharge: Flexible PTO, 12 paid holidays, and a full winter break week.
- Wellness & Development: Annual stipends to put towards personal & professional growth.
- Mental & Physical Health Support: No-cost access to therapy through the Grow platform, weekly flexible hours for self-care (“Mental Health Mornings/Afternoons”), and memberships to leading wellness apps (such as One Medical, Headspace, and Talkspace).
- Extra Perks: Pet insurance discounts, commuter benefits, and global travel assistance.

Research shows that some groups hesitate to apply unless they meet every qualification. If you’re excited about this role but don’t check every box, we encourage you to apply. At Grow, we value diverse experiences, transferable skills, and the unique strengths each person brings.

Grow Therapy is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.

By submitting your application, you acknowledge and consent to the use of automated tools as part of our recruitment process. Specifically, we use a third-party AI tool, Gem, to assist in the initial screening of resumes. Importantly, no hiring decisions are made by the AI tool. All decisions about which candidates move forward are made by our human recruiting team after independent review. We are committed to transparency and fairness in our hiring practices. If you have questions about how our AI tools work, or would like more information about how your application will be processed, please contact us. If you require an accommodation due to a disability, or have concerns about the use of AI in the hiring process, please also contact us. We are happy to provide assistance or offer an alternative method of participating in the recruitment process.


