Senior/Staff Infrastructure Engineer San Francisco
$180k - $250k
Fal
You are a hands-on engineer who builds the software and processes that keep a large fleet of GPU servers healthy and productive. You write systems and tooling for managing thousands of servers, including provisioning, health monitoring, error detection, and recovery, and when something breaks that automation can’t fix, you drive resolution with partners.

Key responsibilities
- Build and maintain a Python fleet tracking system that manages the full lifecycle of servers, including contracting and procurement, target use, pricing, availability, health, RMAs, etc.
- Build server management tooling that automates provisioning, health checks, GPU diagnostics, recovery, and alerting
- Create and maintain metrics, dashboards, and alerting for hardware health across the fleet (GPU errors, disk failures, network issues, thermals)
- Leverage AI to an extreme level to build tools and automate alerting and recovery
- Implement and enforce OS-level security: hardening baselines, SELinux/AppArmor policies, SSH key management, vulnerability scanning, and compliance automation
- Manage and optimize distributed and local storage systems supporting model weights, checkpoints, and ephemeral scratch: NVMe arrays, NFS, parallel file systems, and object storage
- Tune Linux systems for AI workloads: kernel parameters, NUMA topology, CPU pinning, hugepages, I/O schedulers, and GPU driver stack optimization (NVIDIA drivers, CUDA, container runtimes)
- Develop a suite of automated error detection and recovery processes
- Work with partners to solve technical issues

Requirements
- 5+ years of experience managing bare-metal and VM server fleets at scale (100+ nodes)
- Strong software engineering skills in Python; you write production tooling, not scripts
- Deep Linux systems knowledge: boot process, kernel tuning, networking, storage, systemd, cgroups, namespaces, performance profiling
- Strong experience with configuration management and infrastructure-as-code: Ansible, Terraform, cloud-init
- Solid understanding of storage technologies: LVM, RAID, NVMe, NFS, Lustre or GPFS, and Linux I/O stack tuning
- Familiarity with hardware diagnostics and failure modes (GPUs, NVMe, NICs, memory)
- Experience building internal tools or dashboards for infrastructure visibility
- Excellent communication and ability to drive technical decisions across teams
- Self-starter who executes quickly, takes ownership, and constantly seeks improvement

Nice to have
- Familiarity with network configuration and diagnostics (VLAN, VXLAN, ECMP, BGP, tcpdump)
- Experience with NVIDIA GPU infrastructure: driver management, health monitoring, DCGM, NVLink/NVSwitch diagnostics, RDMA, InfiniBand/RoCEv2
- Experience with AMD GPUs
- Experience with bare metal and VM provisioning (PXE/iPXE, Kickstart, libvirt, Qemu/KVM)
- Experience with compliance frameworks relevant to cloud providers (SOC 2, ISO 27001)

Compensation
$180,000-250,000 plus equity + benefits

Location
San Francisco

What we offer at fal
- Interesting and challenging work
- A lot of learning and growth opportunities
- We are currently hiring in downtown San Francisco. We offer visa sponsorship and will help you relocate to San Francisco.
- Health, dental, and vision insurance (US)
- ...What you’ll do As a Senior / Staff Network Engineer, you will define the global technical strategy,... ...Airwallex’s enterprise and cloud network infrastructure. You will design and deploy highly... ...Singapore, Sydney, Melbourne or San Francisco. Responsibilities: Build the... (Senior, Flexible hours, Weekend work)
$180k - $250k
...we offer at fal Interesting and challenging work A lot of learning and growth opportunities We are currently hiring in downtown San Francisco. We offer visa sponsorship and will help you relocate to San Francisco. Health, dental, and vision insurance (US)... (Senior, Currently hiring, Relocation, Visa sponsorship)
$232k - $290k
...Please note this is for San Francisco, CA, United States. You only need to apply to one location if there are multiple listed... ...opportunities, join us, and build real world value. THE WORK As a Senior Staff Security Engineer focused on AI Security, you will be Ripple's deepest... (Senior, Full time, Work at office, Local area)
$145k - $186k
...Staff Engineer, Cloud Engineering | Phoenix, AZ or San Francisco, CA (Hybrid, 3 days/week) A leading FinTech is in the final stretch of a multi-year cloud... ...scalable, secure, and cost-effective AWS infrastructure (EC2, EKS, Lambda, RDS, S3, IAM) • Driving Infrastructure... (Suggested, Work at office, Flexible hours, 3 days per week)
$170k - $220k
...Staff Cloud Engineer / AWS / Hybrid in San Francisco San Francisco, California Hybrid Full Time $170k - $220k A leading financial technology... ...-critical payments platform from on-premise infrastructure to AWS. This is a high-impact initiative tied directly... (Suggested, Full time, 3 days per week)
$204k - $233k
...Staff DevOps Engineer San Francisco, CA (Hybrid) | Full-Time We're partnering with a well-capitalized infrastructure technology company building the foundation for the next generation of transportation. This organization sits at the intersection of energy... (Full time, Local area)
- ..., a self-hostable inference engine for pre-trained models under... ...and growing. Headquartered in San Francisco, backed by Index Ventures... ...images, model caching, eval infrastructure. Today we deploy to AWS and... ...billions of tokens per week). Senior ICs own a requirement end‑to... (Senior, Remote work)
$180k - $286k
...Senior Software Engineer, AI Platform and Enablement About the Role We’re building a next‑... ...Design, implement, and maintain our AI infrastructure supporting our machine learning life... ...located in the Mission District of San Francisco, CA. We’re hiring for a mix of... (Senior, Work at office, Remote work, Flexible hours)
$202.5k - $247.5k
...Software Engineer III/Senior, Infra Platform About ngrok Inc. ngrok is an all-in-one... ...operate ngrok itself. We think about infrastructure the way software engineers think about... ...within commuting distance to San Francisco. Our Bay Area employees commute to the... (Senior, Permanent employment, Full time, Work at office, Local area, Remote work, Home office, Flexible hours)
$207k - $345k
...Senior Engineering Manager - Payroll Platform Rippling gives businesses one place to run HR,... ...365—all within 90 seconds. Based in San Francisco, CA, Rippling has raised $1.4B+ from... ...leader. You will focus on systems design, infrastructure reliability, and backend efficiency... (Senior, Work at office, 3 days per week)
- ...Senior Cloud Data Operations Engineer Responsibilities Support/Operate an Enterprise Data Services Platform (RedShift/EMR/OpenSearch Service... ...in Scala & Java This is a long-term contract in San Francisco (hybrid/remote). MUST be a US citizen, or at least... (Senior, Long term contract, Contract work, Remote work)
$117.2k - $229.2k
Senior Software Engineer - Azure Object Storage job at Microsoft Corporation. San Francisco, CA. Azure Object Storage team is looking for a talented and highly motivated Senior Software Engineer to design and develop the next generation of our object storage stack. We are... (Senior, Local area)
- ...A leading AI infrastructure company is seeking a Staff Infrastructure Engineer in San Francisco. In this role, you will own the systems that power the company at scale, focusing on reliability, scalability, and developer velocity. You will be responsible for designing... (Senior, Work at office)
- Junior Network Engineer job at Revel Staffing. San Francisco, CA. Key Responsibilities Firewall Operations & Security... ...and DataClear standards. Assist senior engineers with firewall changes... ...‑critical environment. Network Infrastructure Support Collaborate with network... (Work experience placement)
$160k - $300k
...our mission is to revolutionize how engineering decisions are made, turning... ...together. About the Role As a Senior / Staff Infrastructure Engineer at Apiphany, you’ll design... ...Sponsorship ~ Hybrid work: 3 days in San Francisco office ~401(k) plan ~ Medical,... (Senior, Work at office, Visa sponsorship, Flexible hours)
$160k - $210k
Zip in San Francisco is seeking an experienced Tech Lead to oversee the Core Infrastructure team. The role involves managing Zip's Kubernetes platform and collaborating with... ...candidate has over 6 years of software engineering experience in infrastructure, with a focus... (Senior)
$102.5k - $188.9k
Cyber Oracle Cloud Security - Senior Consultant job at Deloitte. San Francisco, CA. Our Deloitte Cyber team understands... ...Security, Information Security, Engineering, Information Technology,... ...EPM) Experience with Oracle Cloud Infrastructure (OCI) security Knowledge of Oracle... (Senior, Visa sponsorship)
- ...PhDs, creatives, technologists, and engineers working together to empower people... ...in the Mission District in San Francisco, the SoHo neighborhood of New York... ...experienced and highly motivated "Senior or Staff Security Infrastructure Engineer" to join our team as one... (Senior, Hourly pay, Full time, Flexible hours)
$225k - $275k
Crusoe Energy Systems LLC in San Francisco is looking for a Senior Staff Network Operations Engineer to ensure production reliability across its global network. In this role, you will lead incident response and define key operational standards. Ideal candidates will bring... (Senior)
- ...Patch Technologies, Inc, located in San Francisco, is hiring a Product Engineer to take ownership of the product development lifecycle. The role involves building workflows for environmental commodities, collaborating with cross-functional teams, and maintaining high engineering... (Senior)
$200k - $275k
...A leading technology firm in San Francisco is seeking a Staff Software Engineer focused on Product Security. This role involves building secure frameworks, resolving security risks, and collaborating with teams to ensure best practices in security. The ideal candidate... (Senior)
$250k - $350k
...Senior Software Engineer - Infrastructure Platform - San Francisco, CA - $250K-$350K Location: San Francisco, CA Work Arrangement: Onsite Overview: We're seeking a Senior Software Engineer to help build and scale the core infrastructure powering... (Senior, Full time, Visa sponsorship, Relocation package)
$200k - $240k
...secure world for all. The AI Engineering Team is chartered with... ...pipelines, high-performance infrastructure, and operational tooling... ...faster than the market. As a Senior or Staff AI Infrastructure Engineer... ...others. Headquartered in San Francisco, TRM operates as a... (Senior, Remote work, Worldwide)
- A leading tech company in San Francisco seeks a Senior Staff Engineer to architect and lead the payroll platform. The role involves setting technical strategies and mentoring engineers to ensure robust and scalable solutions in a high-growth environment. Candidates should... (Senior)
- Airbnb, Inc. is looking for a Senior Technical Individual Contributor to define and execute the long-term vision for the Trust Platform in San Francisco. With over 12 years of experience in backend and platform engineering, you will drive strategic architectural decisions... (Senior)
- Epoch Biodesign in San Francisco is seeking a Senior Staff Cloud Support Engineer to lead technical escalations and improve cloud infrastructure. You will mentor engineers and influence architectural decisions while ensuring high availability for AI workloads. The ideal... (Senior)
- A fast-growing AI company in San Francisco is seeking a Senior/Staff Infrastructure Engineer to build and operate cloud infrastructure. This full-time, hybrid role focuses on GCP, Kubernetes, and infrastructure-as-code. You will be responsible for securing deployments and... (Senior, Full time)
$185k - $275k
...passionate, skilled, and experienced Cloud Infrastructure Engineer to help architect, build, and operate... ...5k per year. Preferred locations: San Francisco Bay Area or Seattle. We provide... ..., base pay varies based on location, seniority, skills, and experience. Wherobots... (Senior, Full time, Work at office, Remote work, Work visa, Shift work)
- OpenAI is looking for a Backend Engineer to join the Codex for Finance team in San Francisco. In this role, you will own the end-to-end development lifecycle for new platform capabilities, working closely with product and research teams. Ideal candidates will have 7+ years... (Senior)
$207k - $362.25k
...like Slack and Microsoft 365—all within 90 seconds. Based in San Francisco, CA, Rippling has raised $1.4B+ from the world’s top... ...sent from @Rippling.com addresses. About the role As the Senior Staff Engineer for the Payroll Platform team, you will be the lead architect... (Senior, Work at office, 3 days per week)

