System Engineer, GPU Fleet

$200k - $300k

Fluidstack

As a System Engineer, GPU Fleet, you will manage, operate, and optimize hyperscale GPU compute infrastructure supporting AI/ML training and inference workloads. You will ensure high availability, performance, and reliability of the GPU server fleet through automation, monitoring, troubleshooting, and collaboration with hardware engineering, platform teams, and datacenter operations.

Focus
  • Operate and maintain a large-scale GPU server fleet (H100, B200, GB200) supporting AI/ML workloads; monitor system health, performance, and utilization to maximize uptime and ensure SLA compliance
  • Perform hands-on troubleshooting and root cause analysis of complex hardware, firmware, OS, and application issues across GPU clusters; coordinate with vendors and hardware teams to resolve systemic failures
  • Develop and maintain automation scripts for provisioning, configuration management, monitoring, and remediation at scale
  • Build and improve tooling for GPU health checks, performance diagnostics, driver validation, and automated recovery
  • Execute server provisioning, configuration, firmware updates, and OS installation using automation frameworks; manage lifecycle operations including deployment, maintenance, and decommissioning
  • Participate in a 24x7 on-call rotation; respond to production incidents and coordinate resolution with cross-functional teams including datacenter operations, network engineering, and application teams
  • Lead post-incident reviews, document root causes, and drive continuous improvement initiatives focused on automation, reliability, monitoring, and operational efficiency
Basic Qualifications
  • Bachelor's degree in Computer Science, Engineering, or related technical field (or equivalent practical experience)
  • 3+ years (System Engineer) or 5+ years (Senior System Engineer) in Linux system administration, datacenter operations, or infrastructure engineering
  • Strong Linux/Unix fundamentals including system administration, scripting (Bash, Python), troubleshooting, and performance tuning
  • Experience with server hardware architecture, troubleshooting techniques, and understanding of compute, memory, storage, and networking components
  • Experience with automation and configuration management tools (Ansible, Puppet, Chef, Terraform)
  • Strong analytical and problem-solving skills with ability to diagnose complex technical issues under pressure
  • Excellent communication and collaboration skills; ability to work effectively with cross-functional teams
Preferred Qualifications
  • Experience managing large-scale GPU infrastructure (NVIDIA H100, A100, B200, GB200) in production environments supporting AI/ML workloads
  • Deep knowledge of GPU architecture, the CUDA toolkit, GPU drivers, and monitoring tools (nvidia-smi, DCGM)
  • Experience with HPC cluster management, job schedulers (Slurm, PBS, LSF), and container orchestration (Kubernetes, Docker)
  • Proficiency in out-of-band management (IPMI, Redfish, BMC interfaces) and firmware management for server hardware
  • Experience with high-performance networking (InfiniBand, RoCE, RDMA) and network troubleshooting in GPU cluster environments
  • Familiarity with datacenter operations including rack installations, cabling, power management, and thermal considerations
Salary & Benefits
  • Competitive total compensation package (salary + equity).
  • Retirement or pension plan, in line with local norms.
  • Health, dental, and vision insurance.
  • Generous PTO policy, in line with local norms.

The base salary range for this position is $200,000 - $300,000 per year, depending on experience, skills, qualifications, and location. This range represents our good faith estimate of the compensation for this role at the time of posting. Total compensation may also include equity in the form of stock options.

We are committed to pay equity and transparency.

Fluidstack is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Fluidstack will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

Vacancy posted 9 hours ago