Senior Site Reliability Engineer
$160k - $250k
Hive
DevOps And Systems Team Position
Our unique machine learning needs led us to open our own data centers, with an emphasis on distributed high-performance computing that integrates GPUs. Even with these data centers, we maintain a hybrid infrastructure, using public clouds when they are the right fit. As we continue to commercialize our machine learning models, we also need to grow our DevOps and Site Reliability team to maintain the reliability of our enterprise SaaS offering for our customers. Our ideal candidate is someone who is able to thrive in an unstructured environment and takes automation seriously. You believe there is no task that can't be automated and no server scale too large. You take pride in optimizing performance at scale in every part of the stack and never manually performing the same task twice.
Responsibilities
- Automate manual operational processes
- Improve workflows of developer, data, and machine learning teams
- Manage secure integration and deployment tooling
- Create, maintain, monitor, and audit secure infrastructure
- Manage a diverse array of technology platforms, following best practices and procedures
- Participate in on-call rotation and root cause analysis
- Maintain awareness of industry best practices for data maintenance and handling as they relate to your role
- Adhere to policies, guidelines and procedures pertaining to the protection of information assets
- Report actual or suspected security and/or policy violations/breaches to an appropriate authority
Requirements
- Minimum of 3-5 years of experience in development, operations, IT, or a related field
- Comfortable working on Linux infrastructures (Debian) via the CLI
- Able to learn quickly in a fast-paced environment
- Able to debug, optimize, and automate routine tasks
- Able to multitask, prioritize, and manage time efficiently and independently
- Able to physically lift equipment weighing at least 30 pounds
- Can communicate effectively across teams and management levels
- A degree in computer science or a similar field is a plus!
Technology Stack
- Operating Systems - Linux/Debian Family/Ubuntu
- Configuration Management - Chef
- Containerization - Docker
- Container Orchestrators - Mesosphere/Kubernetes
- Scripting Languages - Python/Ruby/Node/Bash
- CI/CD Tools - Jenkins
- Network hardware - Arista/Cisco/Fortinet
- Hardware - HP/SuperMicro
- Storage - Ceph, S3
- Database - Scylla, Postgres, Pivotal Greenplum
- Message Brokers - RabbitMQ
- Logging/Search - ELK Stack
- AWS - VPC/EC2/IAM/S3
- Networking - TCP/IP, ICMP, SSH, DNS, SSL/TLS, storage systems, RAID, distributed file systems, NFS/iSCSI/CIFS
We are a group of ambitious individuals who are passionate about creating a revolutionary AI company. At Hive, you will have a steep learning curve and an opportunity to contribute to one of the fastest growing AI start-ups in San Francisco. The work you do here will have a noticeable and direct impact on the development of the company.
Thank you for your interest in Hive and we hope to meet you soon!
The current expected base salary for this position ranges from $160,000 - $250,000. Actual compensation may vary depending on a number of factors, including a candidate's qualifications, skills, competencies and experience, and location. Base pay is one part of the total compensation package that is provided to compensate and recognize employees for their work; stock options may be offered in addition to the range provided here.

