Senior Site Reliability Engineer

Zello

About Zello

Zello is a voice-first communication platform, powered by our industry-leading push-to-talk technology, to improve collaboration and productivity for deskless workers. With over 175 million users, we're the #1 rated push-to-talk app in the world, delivering 9 billion (yes, with a B) messages a month. At Zello, our company values are at the heart of what we do every day. We're proud to serve the frontline, we're privileged to connect people in times of crisis across the globe, and we're honored to support first responders. And this is where you come in.

We're seeking a Senior Site Reliability Engineer who can own our data tier at high availability while also pulling weight across the broader platform. As Zello scales, the line between "database problem" and "platform problem" keeps blurring. We want someone who can sit on either side of it. This hire owns our data tier reliability (MySQL, MongoDB, ScyllaDB, Elasticsearch, Redis) and contributes to monitoring, on-call, and our ongoing cloud modernization efforts.

About Zello

Zello is the leading push-to-talk communication platform, enabling instant voice communication for frontline workers across hospitality, logistics, transportation, construction, and public safety. When a hotel manager radios housekeeping or a trucker calls dispatch, they're on Zello, and they need it to work every time. The Platform team builds and operates the infrastructure that makes that possible. Databases sit at the center of that promise: every channel, every message, every login depends on them.

The Role

You'll join the Platform team and report to the Director of Platform Engineering. You'll own the reliability of our MySQL and MongoDB footprint across Google Cloud, work alongside application engineers on performance and schema decisions, and contribute to the broader platform: observability with Prometheus, Loki, and Tempo; on-call; and incident response.

This role suits someone who likes operating real production systems, doesn't get stage fright in incidents, and writes the runbook for the next person who hits the same problem. We're investing in AI to compress incident response, build agents and tooling that speed up root-cause analysis, and lift developer productivity across engineering. We want someone curious about what that looks like for an SRE and excited to help shape it.

After a Successful First Year, You Will Have:

  • Operated Zello's MySQL and MongoDB clusters to documented availability targets, with automated backups, regularly tested restores, and failover the on-call team trusts under real incident pressure.
  • Cut latency or capacity cost on at least one critical database workload through measurable performance work: index strategy, query tuning, schema changes, or sharding.
  • Extended our observability coverage so incidents are diagnosed in minutes rather than hours, with dashboards and alerts the team actually uses.
  • Owned a slice of the Platform on-call rotation and led postmortems that turned recurring incidents into permanent fixes.

What You'll Do

  • Design, deploy, and operate highly available MySQL and MongoDB clusters across our cloud environments: replication, sharding, backups, point-in-time recovery, upgrades, and disaster recovery.
  • Tune query performance, schema, and index strategy in partnership with application engineers, and push fixes upstream into the application when that's the right answer.
  • Extend our observability stack (Prometheus, Loki, and Tempo) so the data tier is as well instrumented as the application tier and traces actually reach the root cause.
  • Participate in the Platform on-call rotation, lead incident response for data-tier issues, and write postmortems that drive durable change.
  • Improve disaster recovery, security posture, and compliance for our database footprint: encryption, access control, audit logging, and backup integrity.
  • Evaluate and operate ScyllaDB/Cassandra and Elasticsearch where they fit the workload, and bring an opinion on when they don't.
  • Write the automation, tooling, and operators that take repetitive work off the team's plate.
  • Use AI to compress incident response and root-cause analysis by building agents, automation, and developer-enablement tooling that scale the team's reliability work.

Who You Are

  • You've operated highly available MySQL and MongoDB in production at scale: replication, sharding, backups, point-in-time recovery, and failover drills you've actually run, not just designed on paper.
  • You diagnose database performance end-to-end (query plan, indexes, locking, OS, storage, network) and can point to specific incidents where you found and fixed a root cause that others had missed.
  • You've shipped meaningful work on at least two of the following: bare metal Linux, containerized workloads (Docker, Kubernetes, or similar), and a major cloud (GCP preferred; AWS or Azure equivalent is fine).
  • You instrument what you build. You've used Prometheus, OpenTelemetry, or comparable systems to close real incidents, and you've written the dashboard the next on-call engineer will actually open.
  • You write code that runs in production: Python, Go, Bash, or similar for automation, tooling, or operators. You don't hand off scripting to someone else.
  • You communicate clearly under pressure and after the fact. Your postmortems are blameless, specific, and lead to changes that stick, and the people you've worked with describe collaborating with you as straightforward.
  • You bring an opinion on managed vs. self-managed databases and can defend the trade-off based on availability, cost, and operational burden.
  • 7+ years in SRE, DevOps, platform, infrastructure, or database reliability roles, with at least 3 years owning production databases.
  • BSc in Computer Science or equivalent practical experience.
  • ScyllaDB/Cassandra or Elasticsearch experience is a plus.
  • You've used AI tooling (copilots, agents, or custom automation) to expedite incident response, root-cause analysis, or developer workflows.

We hire for potential, passion for our mission, and a knack for solving difficult problems over checking every qualification box.

We offer competitive pay and equity with significant upside, and we intentionally design our benefits, including flexible schedules and time off, to encourage healthy and well-balanced employees. We even offer a sabbatical after every five years of service so you're able to pursue and enjoy what matters most to you. And of course, we wouldn't be a technology company without a ping-pong table and free snacks in our break room. Join us!

Zello provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

All Zello personnel are required to comply with defined security, privacy, and compliance requirements applicable to their role along with requirements that are applicable to all Zello personnel.

Vacancy posted 1 day ago