Senior HPC Performance Engineer

$148k - $235.75k

NVIDIA

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars.

We are the GPU Communications Libraries and Networking team at NVIDIA. We deliver libraries like NCCL, NVSHMEM, UCX for Deep Learning and HPC. We are looking for a motivated Performance engineer to influence the roadmap of our communication libraries. The DL and HPC applications of today have a huge compute demand and run on scales which go up to tens of thousands of GPUs. The GPUs are connected with high-speed interconnects (e.g. NVLink, PCIe) within a node and with high-speed networking (e.g. InfiniBand, Ethernet) across the nodes. Communication performance between the GPUs has a direct impact on the end-to-end application performance; and the stakes are even higher at huge scales! This is an outstanding opportunity for someone with HPC and performance background to advance the state of the art in this space. Are you ready to contribute to the development of innovative technologies and help realize NVIDIA's vision?

What you will be doing:
  • Conduct in-depth performance characterization and analysis on large multi-GPU and multi-node clusters.
  • Study the interaction of our libraries with all hardware (GPU, CPU, Networking) and software components in the stack.
  • Evaluate proof-of-concepts, conduct trade‑off analysis when multiple solutions are available.
  • Triage and root‑cause performance issues reported by our customers.
  • Collect a lot of performance data; build tools and infrastructure to visualize and analyze the information.
  • Collaborate with a very dynamic team across multiple time zones.

What we need to see:
  • M.S. (or equivalent experience) or Ph.D. in Computer Science, or related field with relevant performance engineering and HPC experience.
  • 3+ years of experience with parallel programming and at least one communication runtime (MPI, NCCL, UCX, NVSHMEM).
  • Experience conducting performance benchmarking and triage on large scale HPC clusters.
  • Good understanding of computer system architecture, HW‑SW interactions and operating systems principles.
  • Implement micro‑benchmarks in C/C++, read and modify the code base when required.
  • Ability to debug performance issues across the entire HW/SW stack.
  • Proficient in a scripting language, preferably Python.
  • Familiar with containers, cloud provisioning and scheduling tools (Kubernetes, SLURM, Ansible, Docker).
  • Adaptability and passion to learn new areas and tools.
  • Flexibility to work and communicate effectively across different teams and timezones.

Ways to stand out from the crowd:
  • Practical experience with InfiniBand/Ethernet networks in areas like RDMA, topologies, congestion control.
  • Experience debugging network issues in large scale deployments.
  • Familiarity with CUDA programming and/or GPUs.
  • Experience with Deep Learning Frameworks such as PyTorch, TensorFlow.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 148,000 USD - 235,750 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until November 6, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 5 days ago
Similar jobs that could be interesting for you
Based on the Senior HPC Performance Engineer in Santa Clara, CA vacancy
  • $148k - $287.5k

    Sentinel Labs in Santa Clara, California, is seeking a motivated Performance Engineer to advance communication libraries for deep learning and HPC. You will conduct in-depth performance analysis on multi-GPU clusters, collaborate with dynamic teams, and evaluate proof-of... 
    Senior
    Performance

    NVIDIA

    Santa Clara, CA
    4 days ago
  • NVIDIA Gruppe seeks a skilled HPC Cluster Engineer to design, deploy, and operate GPU Compute Clusters for high-performance computing workloads. This role involves collaboration with various teams to ensure effective and reliable cluster performance. Key responsibilities... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • NVIDIA Gruppe is looking for a senior engineer to join their Math Libraries team in Santa Clara...  ...designing and implementing high-performance numerical linear algebra software on GPUs...  ...candidate has over 8 years of experience in HPC software development using C++, along... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $184k - $287.5k

     ...solutions to enable runs of demanding high performance computing, and computationally intensive...  ...-gen distributed storage services for HPC workloads, optimizing both performance...  ...degree in Computer Science, Electrical Engineering or related field or equivalent experience... 
    Senior
    Performance

    NVIDIA

    Santa Clara, CA
    5 days ago
  • $184k - $287.5k

     ...NVIDIA Math Libraries team is looking for a senior engineer to join our development efforts in the area of kernel generation for AI and HPC, specifically targeting matrix...  ...designing, and implementing high quality and performance numerical dense linear algebra software... 
    Senior
    Performance
    Remote work

    NVIDIA

    Santa Clara, CA
    3 days ago
  • $152k - $241.5k

     ...impact on the world. We are seeking a highly skilled and experienced HPC Cluster Engineer to design, deploy, and operate GPU Compute Clusters for EDA (Electronic Design Automation) and high-performance computing workloads used across multiple teams and projects. Join... 
    Senior
    Performance

    NVIDIA

    Santa Clara, CA
    1 day ago
  • NVIDIA is searching for a highly skilled HPC Cluster Engineer to design, deploy, and operate GPU Compute Clusters for Electronic Design Automation and high-performance computing workloads across multiple teams and projects. The role collaborates with researchers and infrastructure... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $152k - $241.5k

     ...measurable, and aligned with long-term engineering demands. What you'll be doing: Manage...  ...Analyze scheduler and infrastructure performance data to identify systemic bottlenecks and...  ...scheduling systems (LSF, Slurm, etc.) in HPC or silicon design environments... 
    Senior
    Performance

    NVIDIA

    Santa Clara, CA
    1 day ago
  • $152k - $287.5k

    NVIDIA Corporation is seeking a motivated Performance Engineer to enhance the roadmap of communication libraries. In this role, you will conduct in-depth performance characterization on multi-GPU clusters and analyze the interaction of libraries with hardware and software... 
    Senior
    Performance

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • $152k - $241.5k

     ...Artificial Intelligence, High-Performance Computing and Visualization....  ...Overview We’re looking for a Senior SRE to join our Compute Farm...  ...they integrate cleanly with HPC schedulers, storage, and network...  ..., or Ruby. Mentored other engineers and influenced technical direction... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $165k - $220k

     ...combines superior infrastructure performance with deep technical expertise...  ...the internal and customer engineering teams, offering valuable...  ...About the role: As a Senior Specialist Field Engineer CoreWeave...  ...within high-performance compute (HPC) environments Collaborate... 
    Senior
    Performance
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Flexible hours

    CoreWeave

    Sunnyvale, CA
    16 days ago
  • $140k - $160k

     ...ASRC Federal is looking for a Senior HPC Engineer, as ASRC Federal InuTeq provides High Performance Computing services across the full HPC lifecycle including computational requirements, architecture, acquisition, and operations for federal government customers, while... 
    Senior
    Performance
    Contract work
    Weekend work

    ASRC Federal Holding Company

    Mountain View, CA
    1 day ago
  • $277k - $358k

     ...Job Description Senior Director, CTIO Engineering Technologists From applied research to advanced engineering...  ...Leads technology investigations, performs a strategic analysis of the industry...  ...their organization related to ~ HPC (high-performance compute) clusters,... 
    Senior
    Performance

    Dell

    Santa Clara, CA
    1 day ago
  •  ...Gruppe in Santa Clara is looking for an experienced AI Software Engineer to lead the integration of new communication libraries into AI frameworks...  ...Science, with at least 5 years of software engineering and HPC/AI experience, along with proficiency in Python and C++. A... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $200k - $400k

     ...Institute Of Foundation Models Engineer The Institute of Foundation Models (IFM) designs...  ...foundation models. We believe performance, fault tolerance, and scalability are co...  ...links to relevant distributed systems, HPC, or large-scale training projects · Include... 
    Senior
    Performance
    Visa sponsorship

    Institute of Foundation Models

    Sunnyvale, CA
    5 days ago
  • $152k - $241.5k

    We are looking for a Senior DL Algorithms Engineer for LLM/Omni model optimizations! Seeking senior engineers who are mindful of performance analysis and optimization to help us squeeze every...  ...or equivalent frameworks for AI, or HPC-heavy application development. Deep... 
    Senior
    Performance

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $186k - $279k

     ...Senior Storage Benchmarking Engineer Santa Clara, California We're in an unbelievably exciting area of...  ...Engineer will design, execute, and analyze performance benchmarks using industry-standard...  .... What You'll Do Configure HPC lab environment so all systems can... 
    Senior
    Performance
    Work at office
    Flexible hours

    Pure Storage

    Santa Clara, CA
    26 days ago
  • $152k - $241.5k

     ...technologies to accelerate AI workloads, and we are looking for an engineer focused on performance validation, analysis, and tracking. In this role, you will...  ...to measure latency, throughput, and efficiency of AI and HPC workloads Analyze performance trends over time and... 
    Senior
    Performance

    NVIDIA

    Santa Clara, CA
    1 day ago
  •  ...accelerated computing platform has revolutionized HPC and AI, and we have built the cuQuantum...  ...Computing. This role will be part of an engineering team developing, scaling, and optimizing...  ...about designing and developing high‑performance software and want to help us build... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • We’re currently seeking a Senior Developer Technology Engineer, Artificial Intelligence! Would you enjoy researching parallel...  ...experts in industry and academia to perform in-depth analysis and optimization of complex AI and HPC algorithms to ensure the best possible AI... 
    Senior
    Performance
    Work experience placement

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $184k - $287.5k

     ...programming models by investigating application performance and developer productivity. Occasional...  ...MS or PhD degree in Computer Science, Engineering, or a STEM field, or equivalent...  ...developing or optimizing workflows involving HPC and AI models. Compensation & Benefits Base... 
    Senior
    Performance
    Work experience placement

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  •  ...groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves...  ...related field (or equivalent experience) with 5+ years of software engineering and HPC/AI experience. Development or integration experience with... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $155k - $185k

     ...Enterprise IT, Hadoop/ Big Data, Hyperscale, HPC and IoT/Embedded customers worldwide. We...  ...talented, passionate, and committed engineers, technologists, and business leaders to...  ...end-to end computing solution and high performance networking products. We have an immediate... 
    Senior
    Performance
    Contract work
    Immediate start
    Worldwide

    Super Micro Computer

    San Jose, CA
    9 days ago
  •  ...highly motivated Chiplet Package Design Engineer to drive the development of advanced...  ...heterogeneous integration , enabling high-performance computing, AI, networking, and advanced...  ...of HBM integration, AI accelerators, or HPC systems Experience with thermal/... 
    Senior
    Performance
    Full time

    Rapidus Corporation US

    Santa Clara, CA
    2 hours ago
  • $184k - $287.5k

    Senior Developer Technology Engineer, Public Sector. Locations: US, CA,...  ...Graph Theory, Weather/Climate Modeling, and AI in HPC. You will be performing in-depth analysis and optimization to ensure the best... 
    Senior
    Performance
    Work experience placement
    Remote work

    NVIDIA Corporation

    Santa Clara, CA
    16 hours ago
  • $184k - $287.5k

     ...datacenter provisioning and management. As a Senior Software Engineer - Datacenter Systems, you will work...  .... These clusters run today's fastest HPC and AI workloads. This role suits...  ...managing infrastructure or systems in high-performance or distributed environments. ~... 
    Senior
    Performance

    NVIDIA

    Santa Clara, CA
    1 day ago
  •  ...Gruppe in Santa Clara is seeking a technical leader for the GPU AI/HPC Infrastructure team. You will design and implement cutting-edge GPU compute clusters, focusing on deep learning and high-performance computing. The ideal candidate will have at least 5+ years of... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  •  ...the endless possibilities of AI. Hardware Qualification Engineer, Senior Staff Location: Santa Clara, CA (Onsite/Hybrid) About the...  ...qualification or reliability, specifically with high-performance computing (HPC), servers, or networking hardware. Technical Depth: Mastery... 
    Senior
    Performance
    Contract work

    d-Matrix inc.

    Santa Clara, CA
    5 days ago
  • $136k - $218.5k

     ...hardworking, motivated and creative Senior Verification Engineer for our Tegra SoC Memory Subsystem IP...  ...functional correctness and for meeting performance expectations. This position offers...  ...consumer graphics, self-driving cars, HPC, cloud computing, and AI! What you’ll... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $168k - $258.75k

    High Performance Computing (HPC) and Artificial Intelligence (AI) are key markets for NVIDIA. Researchers and scientists actively embrace full...  ...Computing domain. Master’s degree in Science or Computer Engineering (or equivalent experience) Fluent in using NVIDIA... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
