Staff HPC Infrastructure Engineer
$173k - $237.95k
Guardant Health
Company Description
Guardant Health is a leading precision oncology company focused on guarding wellness and giving every person more time free from cancer. Founded in 2012, Guardant is transforming patient care and accelerating new cancer therapies by providing critical insights into what drives disease through its advanced blood and tissue tests, real-world data and AI analytics. Guardant tests help improve outcomes across all stages of care, including screening to find cancer early, monitoring for recurrence in early-stage cancer, and treatment selection for patients with advanced cancer. For more information, visit guardanthealth.com and follow the company on LinkedIn, X (Twitter) and Facebook.

About the Role:
You enjoy an agile, very fast-paced and highly technical environment. You are a self-driven, accomplished technologist who strives to continually improve your skills, your value to the company and the computational infrastructure. You are dedicated to engineering excellence yet pragmatic and flexible. You can maintain the day-to-day support SLA while running key projects that move the business forward.

Essential Duties and Responsibilities:
- Act as a technical lead in day-to-day operations
- Help manage the HPC interconnects
- Help integrate the HPC systems with the bandwidth on-demand system
- Help integrate the HPC system with the single namespace storage system
- Help integrate cloud bursting as part of the HPC abstraction work
- Work with the networking infrastructure team to manage and optimize the connectivity to and from the HPC systems and locales
- Help manage multiple HPC clusters and cluster file systems.
- Help research, develop and implement the next generation HPC solution
- Troubleshoot the production system stack down to the source-code level (e.g., shell scripts, Python and others).
- Maintain, monitor, and support the infrastructure environment and/or facilities.
- Use and maintain enhanced production monitoring and additional capability.
- Support improvements for increased system reliability and performance.
- Support multiple systems or applications of medium to high complexity (complexity defined by size, technology used, and system feeds and interfaces) with multiple concurrent users, ensuring control, integrity, and accessibility.
- Support systems at remote locations, including internationally
- Work with offsite consultants to maintain the infrastructure
- Work with vendors to troubleshoot, upgrade and repair systems as needed
- Participate in a 24/7 on-call rotation
Qualifications:
- B.S. in Computer Science or related field
- 4+ years of TCP/IP networking experience
- 2+ years of RDMA networking experience
- 4+ years of Linux/Unix administration, knowledge of Unix network protocols, TCP/IP network fundamentals, core infrastructure technologies and virtualization
- 2+ years of experience with large-scale data storage and compute cluster (HPC) infrastructure
- 2+ years working in and with on-premises and cloud-based (AWS, Google, IBM and Azure) data centers
- 2+ years of building software release and ops processes and automation toolsets
- 2+ years providing documentation of system administration
- Cisco Certified Network Professional certification
- Experience with Arista and compatible networking, up to and including 400 Gb/s links
- Experience with Mellanox InfiniBand fabric
- Experience administering IBM’s General Parallel File System (GPFS)
- Experience administering the Slurm scheduler
- Experience using Warewulf
- Experience with cloud bursting technologies
- Experience with wide area file systems
- Experience with Docker and container technologies
- Experience with Kubernetes
- Operating infrastructure compliant with HIPAA and SOX standards


