Site Operations Lead

$160k - $200k

AI Fabrik

About AI Fabrik

AI Fabrik builds an edge inference delivery network for high-performance tokens, with faster time-to-market from grid to tokens. Our mission is to build the inference infrastructure we wished every enterprise already had — close to users, close to the cloud, and extremely resilient for real-time workloads. We are builders, architects, engineers, and researchers with hands-on experience in real-world AI deployment in production, and decades of data center experience that taught us exactly what needs to change.

AI Fabrik was incubated inside Gruve and backed by , (Temasek), , — existing investors from Gruve who followed us into this new chapter. We are deploying five initial production sites, with the first one coming online in July 2026.

About the Role

We are seeking an experienced operations leader to oversee the day-to-day management of our mission-critical infrastructure. In this role, you will be responsible for ensuring the reliability, availability, and scalability of live 24x7 production environments, while maintaining exceptional service levels for customers and stakeholders. The ideal candidate has hands-on experience operating critical facilities, establishing and managing service level agreements (SLAs), building strong vendor and partner relationships, and proactively identifying and mitigating risks before they impact operations. You will lead incident response efforts, drive capacity planning initiatives, manage operating budgets, and continuously improve operational processes to support business growth. Experience with high-density GPU deployments, AI infrastructure, and liquid cooling technologies is highly desirable. This is a unique opportunity to help shape and scale the operational foundation of next-generation AI infrastructure.

Key Responsibilities

  • Own day-to-day operation and uptime of our live sites — keeping power, cooling, network, and compute infrastructure available and healthy in a 24x7 environment
  • Manage the ongoing vendor ecosystem (facility maintenance, smart/remote hands, cooling, UPS and generator service, fire systems, physical security) — defining, tracking, and enforcing SLAs and holding each vendor to performance, response times, and budget
  • Build and run the preventive and corrective maintenance program, scheduling maintenance windows and coordinating vendors with minimal disruption to live workloads
  • Lead incident and outage response — own on-call and escalation, drive rapid resolution, and close the loop with root-cause analysis and preventive actions
  • Monitor facility health continuously (DCIM, building management, environmental) and manage capacity — power, cooling, space, and rack utilization — ahead of the engineering team's growth
  • Run change management for the live environment, and coordinate ongoing hardware operations (installs, moves, decommissions, cabling, cross-connects, spares) in support of engineering
  • Own operating budget and efficiency (opex, utility costs, PUE), physical security operations, and compliance, inspections, and audits (fire/safety, environmental, frameworks such as SOC 2)
  • Maintain operational documentation (runbooks, MOPs, SOPs/EOPs), report to leadership on uptime, capacity, incidents, and spend, and support new site bring-up and handover into operations as locations come online

Basic Qualifications

  • Proven experience operating live data centers or critical facilities, owning uptime, maintenance, and vendor performance in a 24x7 environment (this is a hard requirement)
  • Strong vendor and service-provider management: setting and enforcing SLAs and maintenance contracts, and holding multiple vendors accountable on availability, quality, and cost
  • Working knowledge of critical facility systems in operation — power (utility, switchgear, UPS, generators, PDUs), mechanical and liquid cooling, fire suppression, cabling, and physical security
  • Hands-on with monitoring and management tooling (DCIM, building/facility management, environmental), plus solid capacity planning for power, cooling, and space
  • A track record in incident and outage management — on-call ownership, fast resolution, root-cause analysis, and preventive follow-through
  • Experience managing operating budgets with demonstrated cost and efficiency control (including PUE/energy), and familiarity with relevant codes, standards, and audits (fire/safety, Uptime Institute, TIA-942)
  • Strong documentation discipline and stakeholder communication — crisp reporting to leadership and coordination across a distributed US/India team; willing to be on-site, carry on-call, and travel as operations demand
  • Exposure to high-density and GPU/AI infrastructure and liquid/immersion cooling is a strong plus, as is new-site bring-up experience and relevant certifications (e.g., CDCP/CDCDP/DCOM, PMP)

Salary Range

$160,000 - $200,000 USD + Benefits

Why AI Fabrik

At AI Fabrik, we hire for impact. We want those who challenge how inference infrastructure is built and who excel at delivering it in production. We are builders, architects, engineers, and researchers. We move fast, work with rigor, and care deeply about what runs in the real world.

We are committed to building a diverse and inclusive team. AI Fabrik is an equal opportunity employer. We welcome applicants from all backgrounds and thank all who apply; however, only those selected for an interview will be contacted.

Please note that this is an onsite position based out of AI Fabrik’s Redwood City, California office.

Vacancy posted 14 hours ago