Senior HPC Cluster Engineer

ShiftCode Analytics

GPU Compute Cluster Design and Implementation

The candidate should possess an in-depth understanding of the design and implementation of groundbreaking GPU compute clusters that run demanding deep learning, high-performance computing, and other computationally intensive workloads.

Qualifications

  • Minimum 7 years of experience designing and operating large-scale compute infrastructure
  • Experience analyzing and tuning performance for a variety of HPC workloads
  • Working knowledge of cluster configuration management tools such as Ansible, Puppet, or Salt
  • Experience with HPC cluster job schedulers such as Slurm or LSF
  • In-depth understanding of container technologies such as Docker, Singularity, Shifter, and Charliecloud
  • Proficiency in CentOS/RHEL and/or Ubuntu Linux distributions, as well as Python programming and Bash scripting
  • Experience with HPC workflows that use MPI

Ways to stand out from the crowd:

  • Understanding of MLPerf benchmarking
  • Familiarity with InfiniBand, including IPoIB and RDMA
  • Understanding of fast, distributed storage systems like Lustre and GPFS for HPC workloads
Vacancy posted 1 day ago