
Site Reliability Engineer

FLUIX

FLUIX is building the AI operating system that plans, designs, and optimizes AI infrastructure. We are based in Silicon Valley. We specialize in providing AI-driven solutions for data centers and power providers, leveraging cutting-edge Machine Learning (ML) and Artificial Intelligence (AI) technologies. Our mission is to double America’s compute capacity without building new data centers.

We are seeking a skilled Site Reliability Engineer to join our growing team. The ideal candidate will help ensure the reliability, scalability, and performance of our hybrid (cloud and on-prem) platform while supporting our AI/ML infrastructure. You will work closely with our engineering, AI, and operations teams to build and maintain robust systems that support our solutions. Your expertise in ML/AI and experience with data center sites will be crucial in driving the success of our platform.

Who you’ll work closely with
Founder & CEO Chase Overcash CTO

What you’ll do
  • Design, implement, and maintain scalable systems while optimizing performance, ensuring high availability and disaster recovery, and assisting with codebase refactoring for modular deployment.
  • Develop and maintain automation tools to streamline operations, improve efficiency, and automate repetitive tasks to enhance system reliability.
  • Collaborate with engineering and data science teams to integrate ML and AI models into production environments, ensuring seamless integration and high performance of those models within our technology stack.
  • Identify areas for improvement and drive initiatives to enhance system reliability and performance, while staying current on industry trends and advancements in SRE practices, ML, and AI technologies.
  • Respond to and resolve incidents quickly to minimize impact, conduct post-incident reviews, and implement improvements to prevent recurrence.
  • Create and manage multiple cloud instances (dev, staging, test), optimize cloud infrastructure and data center operations, and ensure the security and compliance of both infrastructure and applications.

Your background
  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • Proven experience as a Site Reliability Engineer or in a similar role in a SaaS environment, with a strong background in managing and optimizing cloud infrastructure (AWS preferred, or GCP, Azure), experience with ML and AI technologies, and familiarity with data center operations integrations.
  • Proficiency in programming and scripting languages (e.g., Python), experience with containerization and orchestration tools (Kubernetes), a strong understanding of networking, security, and performance optimization, and knowledge of CI/CD pipelines and DevOps practices.
  • Excellent problem-solving skills with attention to detail, strong communication and collaboration abilities, and the capacity to thrive in a fast-paced, dynamic startup environment.

Culture Fit
We are looking for obsessed individuals who want to give it their all. We are not afraid to get our hands dirty with physical and software systems. We are eager to visit and work with clients and understand the importance and gravitas of their mission-critical work. We are eager to come into the office and on-site, as our work directly affects physical environments. Because our work is mission-critical, we are eager to help our teammates and co-workers during holidays, weekends, and emergencies. We are cordial and over-communicate with teammates, co-workers, and management.

  • Attractive compensation package, including equity options.
  • Comprehensive health, dental, and vision insurance, along with other standard benefits.
  • A dynamic and collaborative San Francisco Bay Area work environment.
  • Opportunities for professional growth and development, with the chance to shape the future of technology in the industry.

Vacancy posted 1 day ago