System Engineer

$200k - $300k

FluidStack

About Fluidstack

We exist to make humanity more free. For most of human history, you farmed or you starved. Technology gave people more time for the things they wanted to do, instead of things they had to do. Powerful AI will be the biggest lever for human choice we've ever built - but only if models are aligned with what humanity actually wants. There are groups building AI who don't share these goals. Whoever deploys frontier compute infrastructure fastest will decide whether AI expands human freedom or shrinks it.
We're singularly focused on delivering 10 to 100s of GWs of compute faster than anyone else, rethinking every layer of the stack. We acquire power, design and build data centers, and operate them - with teams spanning hardware and software. Speed and scale are our key differentiators. Come be a part of building civilization-scale infrastructure for AI.
We hire people who care deeply about this problem space. If that is you, please apply!

About the Role

As a System Engineer, you will manage, operate, and optimize hyperscale GPU compute infrastructure supporting AI/ML training and inference workloads. You will ensure high availability, performance, and reliability of the GPU server fleet through automation, monitoring, troubleshooting, and collaboration with hardware engineering, platform teams, and datacenter operations.

Focus
  • Operate and maintain large-scale GPU server fleet (H100, B200, GB200) supporting AI/ML workloads; monitor system health, performance, and utilization to maximize uptime and ensure SLA compliance
  • Perform hands-on troubleshooting and root cause analysis of complex hardware, firmware, OS, and application issues across GPU clusters; coordinate with vendors and hardware teams to resolve systemic failures
  • Develop and maintain automation scripts for provisioning, configuration management, monitoring, and remediation at scale
  • Build and improve tooling for GPU health checks, performance diagnostics, driver validation, and automated recovery
  • Execute server provisioning, configuration, firmware updates, and OS installation using automation frameworks; manage lifecycle operations including deployment, maintenance, and decommissioning
  • Participate in 24x7 on-call rotation; respond to production incidents and coordinate resolution with cross-functional teams including datacenter operations, network engineering, and application teams
  • Lead post-incident reviews, document root causes, and drive continuous improvement initiatives focused on automation, reliability, monitoring, and operational efficiency
Basic Qualifications
  • Bachelor's degree in Computer Science, Engineering, or related technical field (or equivalent practical experience)
  • 3+ years (System Engineer) or 5+ years (Senior System Engineer) in Linux system administration, datacenter operations, or infrastructure engineering
  • Strong Linux/Unix fundamentals including system administration, scripting (Bash, Python), troubleshooting, and performance tuning
  • Experience with server hardware architecture, troubleshooting techniques, and understanding of compute, memory, storage, and networking components
  • Experience with automation and configuration management tools (Ansible, Puppet, Chef, Terraform)
  • Strong analytical and problem-solving skills with ability to diagnose complex technical issues under pressure
  • Excellent communication and collaboration skills; ability to work effectively with cross-functional teams
Preferred Qualifications
  • Experience managing large-scale GPU infrastructure (NVIDIA H100, A100, B200, GB200) in production environments supporting AI/ML workloads
  • Deep knowledge of GPU architecture, CUDA toolkit, GPU drivers, monitoring tools (nvidia-smi, DCGM)
  • Experience with HPC cluster management, job schedulers (Slurm, PBS, LSF), and container orchestration (Kubernetes, Docker)
  • Proficiency in out-of-band management (IPMI, Redfish, BMC) and firmware management for server hardware
  • Experience with high-performance networking (InfiniBand, RoCE, RDMA) and network troubleshooting in GPU cluster environments
  • Familiarity with datacenter operations including rack installations, cabling, power management, and thermal considerations
Salary & Benefits
  • Competitive total compensation package (salary + equity).
  • Retirement or pension plan, in line with local norms.
  • Health, dental, and vision insurance.
  • Generous PTO policy, in line with local norms.

The base salary range for this position is $200,000 - $300,000 per year, depending on experience, skills, qualifications, and location. This range represents our good faith estimate of the compensation for this role at the time of posting. Total compensation may also include equity in the form of stock options.

We are committed to pay equity and transparency.

Fluidstack is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Fluidstack will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

You will receive a confirmation email once your application has been successfully accepted. If there is an error with your submission and you do not receive a confirmation email, please email us (the address is shown on click.appcast.io) with your resume/CV, the role you applied for, and the date you submitted your application. Someone from our recruiting team will be in touch.
Vacancy posted 4 days ago