Site Reliability Engineer
FLUIX
FLUIX is building the AI operating system that plans, designs, and optimizes AI infrastructure. We are based in Silicon Valley. We specialize in providing AI-driven solutions for data centers and power providers, leveraging cutting-edge Machine Learning (ML) and Artificial Intelligence (AI) technologies. Our mission is to double America’s compute capacity without building new data centers.

We are seeking a skilled Site Reliability Engineer to join our growing team. The ideal candidate will help ensure the reliability, scalability, and performance of our hybrid (cloud and on-prem) platform while supporting our AI/ML infrastructure. You will work closely with our engineering, AI, and operations teams to build and maintain robust systems that support our solutions. Your expertise in ML/AI and experience with data center sites will be crucial in driving the success of our platform.

Who you’ll work closely with
Founder & CEO Chase Overcash CTO

What you’ll do
- Design, implement, and maintain scalable systems while optimizing performance, ensuring high availability and disaster recovery, and assisting with codebase refactoring for modular deployment.
- Develop and maintain automation tools that streamline operations and automate repetitive tasks to improve efficiency and system reliability.
- Collaborate with engineering and data science teams to integrate ML and AI models into production environments, ensuring seamless integration and high performance within our technology stack.
- Identify areas for improvement and drive initiatives to enhance system reliability and performance, while staying current on industry trends and advancements in SRE practices, ML, and AI technologies.
- Respond to and resolve incidents promptly to minimize impact, conduct post-incident reviews, and implement improvements to prevent recurrence.
- Create and manage multiple cloud instances (dev, staging, test), optimize cloud infrastructure and data center operations, and ensure the security and compliance of both infrastructure and applications.

Your background
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
- Proven experience as a Site Reliability Engineer or in a similar role in a SaaS environment, with a strong background in managing and optimizing cloud infrastructure (AWS preferred, or GCP, Azure), experience with ML and AI technologies, and familiarity with data center operations integrations.
- Proficiency in programming and scripting languages (e.g., Python), experience with containerization and orchestration tools (Kubernetes), a strong understanding of networking, security, and performance optimization, and knowledge of CI/CD pipelines and DevOps practices.
- Excellent problem-solving skills with attention to detail, strong communication and collaboration abilities, and the capacity to thrive in a fast-paced, dynamic startup environment.

Culture Fit
- We are looking for obsessed individuals who want to give it their all.
- We are not afraid to get our hands dirty with physical and software systems.
- We are eager to visit and work with clients and understand the importance and gravitas of their mission-critical work.
- We are eager to come into the office and on-site, as our work directly affects physical environments.
- Due to our mission-critical work, we understand the need for, and are eager to help, our teammates and co-workers during holidays, weekends, and emergencies.
- We are cordial and over-communicate with teammates, co-workers, and management.

Benefits
- Attractive compensation package, including equity options.
- Comprehensive health, dental, and vision insurance, along with other standard benefits.
- A dynamic and collaborative San Francisco Bay Area work environment.
- Opportunities for professional growth and development, with the chance to shape the future of technology in the industry.
$140k - $220k

