Senior HPC Engineer
$140k - $160k
Vistronix
ASRC Federal is looking for a Senior HPC Engineer. ASRC Federal InuTeq provides High Performance Computing services across the full HPC lifecycle, including computational requirements, architecture, acquisition, and operations, for federal government customers, while promoting innovation, continuous standards-driven improvement, and industry best practices.
This senior role supports the NASA NACS High Performance Computing contract by delivering continuous architectural enhancements and operational excellence. The successful candidate will serve as a proactive senior member of the team, reporting to the Manager of the HPC Computer Systems and Storage (CSS) group, and will bring extensive experience in designing, installing, maintaining, and upgrading large-scale HPC environments, including expertise with common batch schedulers such as PBS, Slurm, or Moab/Torque and with InfiniBand troubleshooting and optimization.
The candidate will actively participate in day-to-day HPC operations such as system patching, OS upgrades, new system deployments, scripting, troubleshooting, testing, benchmarking, and user tool development. The role also directly supports scientific users by diagnosing and reproducing application performance issues, analyzing trouble tickets for recurring patterns, and contributing to both system improvements and user education.
Key Responsibilities:
Designs, deploys and maintains production HPC clusters with more than 2,000 InfiniBand-connected nodes and over 100 petabytes of data storage.
Shepherd and/or contribute to scalable feature designs through the entire software development process, from requirements and use cases to release
Designs and develops scripts for system administration, monitoring and usage reporting.
Modify existing software to correct errors and/or improve performance
Designs and develops scripts for system regression and performance testing (file systems (Lustre), scheduler (PBS), interconnect (HDR/NDR, Slingshot), high availability, etc.).
Troubleshoots, isolates and resolves application, system and other technical problems (hardware, software, and network).
Understands research use cases, researches and deploys new technologies, defining cost, performance and other trade-offs.
Manages and maintains tools for provisioning, configuration management (HPCM, Ansible and Git), resource management, scheduling and all other necessary aspects of HPC in accordance with best practices.
Researches, deploys and manages networking and security infrastructure, including development of policies and procedures.
Assists in developing and writing proposals and publications.
Creates and provides clear documentation.
Mentoring junior staff and cross training peers
After hours/weekend support as required
Moderate Supercomputing System Administration that contributes to:
Day-to-day operations of the Linux HPC clusters and storage systems
Proactively monitoring, analyzing, and correcting system issues
Development of scripts to automate repetitive tasks or tools to enhance support of the HPC systems
System performance analysis and tuning
Building, installing, and supporting user-requested software
Supporting evaluation and assessment of new HPC technology
Resolving user-reported issues and managing support ticket requests in Remedy
Requirements:
Bachelor's degree in computer science or related field
Strong computer science background with in-depth systems-level knowledge in operating systems and networking
A minimum of 10 years of experience in the administration of HPC systems and scheduling software (PBS, Slurm, or Moab/Torque)
A minimum of 10 years of experience in systems programming in heterogeneous, multi-platform HPC environments
Strong ability to analyze, debug and maintain the integrity of an existing code base
Demonstrated equivalent of 5 years of Linux/UNIX user support experience and hands-on experience with administration of Linux systems
Experience working with HPC applications and proficiency in at least C, C++, or Fortran
Superior scripting skills and excellent attention to detail; proficiency in at least Python, Perl, or Bash
Strong ability to interact with customers to understand needs, elicit requirements, and get feedback on prototype solutions
Excellent communication and people skills; excellent time management and organizational skills
Experience with system configuration management tools (e.g., Puppet, Chef, Ansible)
Experience with revision control software (e.g., CVS, SVN, Git)
Track record of delivering commercial quality software on schedule with excellent quality through multiple release cycles
Proficiency at documentation and technical writing
Preferred Skills:
Proficiency with analysis and problem-solving skills for debugging and optimization of applications
Familiarity/proficiency with OpenMP and Message Passing Interface (MPI) programming
Experience with Lustre and InfiniBand
Experience with cloud technologies (AWS, Azure, GCP), OpenStack or Kubernetes is a plus
We invest in the lives of our employees, both in and out of the workplace, by providing competitive pay and benefits packages. This position offers a pay range of $140,000.00 - $160,000.00, depending on experience, seniority, geographic location, and other factors permitted by law. Benefits offered may include healthcare, dental, vision, and life insurance; 401(k); education assistance; and paid time off, including PTO, holidays, and any other paid leave required by law.
Job Details
Job Family: Information Technology
Job Function: Systems Administration
Pay Type: Salary
Education Level: Bachelor's Degree
Hiring Rate: 160,000 USD



