Principal/Staff HPC Network Engineer
The San Francisco Compute Company
We're building the company which will de-risk the largest infrastructure build-out in history.
When people finance GPU clusters, the datacenters housing them, and the infrastructure powering them, they need "offtake" - meaning someone has signed a contract to lease the cluster for a period of time before it's even built.
Financing a GPU cluster is inherently risky, since margins are thin and volumes are huge. Lenders don't want to take on the risk that cluster developers can't repay their loan, and cluster developers really don't want to risk not selling their cluster. As a result, risk is offloaded to the customer using fixed-price long-term contracts.
If you don't mitigate this customer risk, there's a bubble. This isn't SaaS anymore - application-layer companies sign multi-year contracts for compute and inference, but sell to customers on monthly subscriptions. If you mess up a purchase, it's game over: a minor shift in your revenue growth rate might mean the difference between profit and bankruptcy. But what if companies could exit their contract by selling it back to the market?
Otherwise, as AI scales, compute only becomes available to folks who can effectively take on that risk. A 2-person startup in a San Francisco Victorian can't realistically sign a 5-year take-or-pay contract on $100m supercomputers. But they may be able to buy the month of liquidity that someone else sold back.
So that's what we make: a liquid market for GPU offtake.
About the Role
GPU clusters are some of the most performant computers on the planet. Even smaller clusters by today's standards would have ranked in the TOP500 five years ago. Our infrastructure team is responsible for architecting and deploying new clusters around the world and keeping them running smoothly. You'll participate in the on-call rotation, fix issues when they arise, and lean into automation to enable deployments at scale. We're a small but ambitious team, so you'll be an early contributor helping to shape culture, mentor junior engineers, and learn from our customers.
About You
You have 10+ years of experience, with hands-on management or architecture of the network for at least one GPU cluster (ideally a cluster with >1k GPUs, but not required)
You deeply understand the fundamentals of Ethernet (RoCEv2) and/or InfiniBand networks in Clos/fat-tree topologies
You have built HPC network architectures (eBGP, fat-tree, VXLAN, MCLAG, etc.)
The idea of implementing zero-touch provisioning for a large multi-layer network excites you (you embrace automation)
You appreciate, value and generate good documentation
You have the ability and willingness to mentor junior engineers
You're open to coming in to our San Francisco office 3-4 days per week
Some Nice to Haves
You understand data center concepts including power, cooling and how to engage with colo providers
You have experience with Linux systems administration including managing kernel drivers and tuning the network stack
You have experience with Linux virtualization (KVM, QEMU, libvirt, etc.)
You've had exposure to containers and Kubernetes operators
Benefits
Team members are offered a competitive salary along with equity in the company
Visa Sponsorships
Yes, we sponsor visas and work permits
Retirement Matching
We match 401(k) plans up to 4%
Medical, Dental & Vision
We offer competitive medical, dental, and vision insurance for employees and dependents and cover 100% of premiums
Time Off
We offer unlimited paid time off as well as 10+ observed holidays
Parental Leave
We offer biological, adoptive, and foster parents paid time off to spend quality time with family
Daily Lunch
We cover lunch daily for employees
Unlimited Office Book Budget
You can buy as many books for the office as you want
The San Francisco Compute Company is committed to maintaining a workplace free from discrimination and harassment.
We make employment decisions based on business needs, job requirements, and individual qualifications, without regard to race, color, religion, belief, national origin, social or ethnic origin, age, physical, mental, or sensory disability, sexual orientation, gender identity or expression, marital status, civil union or domestic partnership status, past or present military service, HIV status, family medical history or genetic information, family or parental status including pregnancy, or any other status protected by law.
We welcome the opportunity to consider qualified applicants with prior arrest or conviction records. Our commitment to diversity includes hiring talented individuals regardless of their criminal history, in accordance with local, state, and federal laws, including San Francisco's Fair Chance Ordinance and California's ban-the-box laws.

