Senior HPC Engineer

$140k - $160k

Vistronix

ASRC Federal is looking for a Senior HPC Engineer. ASRC Federal InuTeq provides High Performance Computing services across the full HPC lifecycle, including computational requirements, architecture, acquisition, and operations, for federal government customers, while promoting innovation, continuous standards-driven improvement, and industry best practices.

This senior role supports the NASA NACS High Performance Computing contract by delivering continuous architectural enhancements and operational excellence. The successful candidate will serve as a proactive senior member of the team, reporting to the Manager of the HPC Computer Systems and Storage (CSS) group, and will bring extensive experience in designing, installing, maintaining, and upgrading large-scale HPC environments, including expertise with common batch schedulers such as PBS, Slurm, or Moab/Torque and with InfiniBand troubleshooting and optimization.

The engineer will actively participate in day-to-day HPC operations such as system patching, OS upgrades, new system deployments, scripting, troubleshooting, testing, benchmarking, and user tool development. The role also directly supports scientific users by diagnosing and reproducing application performance issues, analyzing trouble tickets for recurring patterns, and contributing to both system improvements and user education.

Key Responsibilities:

  • Design, deploy and maintain production HPC clusters with more than 2,000 nodes, InfiniBand interconnect, and 100+ petabytes of data storage.

  • Shepherd and/or contribute to scalable feature designs through the entire software development process, from requirements and use cases to release

  • Designs and develops scripts for system administration, monitoring and usage reporting.

  • Modify existing software to correct errors and/or improve performance

  • Designs and develops scripts for system regression and performance testing (file systems (Lustre), scheduler (PBS), interconnect (HDR/NDR, Slingshot), high availability, etc.).

  • Troubleshoots, isolates and resolves application, system and other technical problems (hardware, software, and network).

  • Understands research use cases, researches and deploys new technologies, defining cost, performance and other trade-offs.

  • Manages and maintains tools for provisioning, configuration management (HPCM, Ansible & Git), resource management, scheduling and all necessary aspects of HPC in accordance with best practices.

  • Researches, deploys and manages networking and security infrastructure, including development of policies and procedures.

  • Assists in developing and writing proposals and publications.

  • Creates and provides clear documentation.

  • Mentoring junior staff and cross-training peers

  • After hours/weekend support as required

  • Moderate Supercomputing System Administration that contributes to:

  • Day-to-day operations of the Linux HPC clusters and storage systems

  • Proactively monitoring, analyzing, and correcting system issues

  • Development of scripts to automate repetitive tasks or tools to enhance support of the HPC systems

  • System performance analysis and tuning

  • Building, installing, and supporting user-requested software

  • Supporting evaluation and assessment of new HPC technology

  • Resolving user-reported issues and managing support ticket requests in Remedy

Requirements:

  • Bachelor's degree in computer science or related field

  • Strong computer science background with in-depth systems-level knowledge in operating systems and networking

  • A minimum of 10 years of experience in the administration of HPC systems and scheduling software (PBS, Slurm, or Moab/Torque)

  • A minimum of 10 years of experience of systems programming in heterogeneous, multi-platform HPC environments

  • Strong ability to analyze, debug and maintain the integrity of an existing code base

  • Demonstrated equivalent of 5 years of Linux/UNIX user support experience and hands-on experience with administration of Linux systems

  • Experience working with HPC applications and proficiency in at least C, C++, or Fortran

  • Superior scripting skills and excellent attention to detail; proficiency in at least Python, Perl, or Bash

  • Strong ability to interact with customers to understand needs, elicit requirements, and get feedback on prototype solutions

  • Excellent communication and people skills; excellent time management and organizational skills

  • Experience with system configuration management tools (e.g., Puppet, Chef, Ansible)

  • Experience with revision control software (e.g., CVS, SVN, Git)

  • Track record of delivering commercial quality software on schedule with excellent quality through multiple release cycles

  • Proficiency at documentation and technical writing

Preferred Skills:

  • Proficiency with analysis and problem-solving skills for debugging and optimization of applications

  • Familiarity/proficiency with OpenMP and Message Passing Interface (MPI) programming

  • Experience with Lustre and InfiniBand

  • Experience with cloud technologies (AWS, Azure, GCP), OpenStack or Kubernetes is a plus

We invest in the lives of our employees, both in and out of the workplace, by providing competitive pay and benefits packages. This position offers a pay range of $140,000.00 - $160,000.00, depending on experience, seniority, geographic location, and other factors permitted by law. Benefits offered may include healthcare, dental, vision, and life insurance; 401(k); education assistance; and paid time off, including PTO, holidays, and any other paid leave required by law.

Job Details

Job Family Information Technology

Job Function Systems Administration

Pay Type Salary

Education Level Bachelor's Degree

Hiring Rate 160,000 USD

Vacancy posted 4 days ago