## Senior HPC Cluster Engineer

NVIDIA
$152k - $241.5k

* locations: US, CA, Santa Clara; US, TX, Austin; US, WA, Redmond
* time type: Full time
* posted on: Posted Yesterday
* job requisition id: JR2014289

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an NVIDIAN, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We are seeking a highly skilled and experienced HPC Cluster Engineer to design, deploy, and operate GPU Compute Clusters for EDA (Electronic Design Automation) and high-performance computing workloads used across multiple teams and projects. Join our engineering team and collaborate with researchers and infrastructure teams to ensure our GPU clusters are highly performant, scalable and reliable.

**What you'll be doing:**

* Develop and enhance our ecosystem around GPU-accelerated computing, including developing scalable automation solutions.
* Continuously improve infrastructure provisioning, management, observability and day-to-day operation through automation.
* Provide technical leadership and strategic guidance for managing large-scale HPC systems, including the deployment of compute, networking, and storage.
* Foster strong customer and multi-functional partnerships to ensure consistent cluster support and rapidly adapt to evolving user needs.
* Support our researchers to run their EDA workloads, including performance analysis and optimizations.
* Conduct root cause analysis and suggest corrective action. Proactively find and fix issues before they occur.
* Build innovative tooling to accelerate researchers' velocity, debugging and software performance at scale.

**What we need to see:**

* Bachelor's degree in Computer Science, Electrical Engineering or related field, or equivalent experience.
* Minimum of 5 years of proven experience crafting and operating large-scale compute infrastructure, including cluster configuration management tools such as BCM or Ansible.
* Experience with AI/HPC job schedulers and orchestrators, such as Slurm, LSF, PBS or K8s. Applied experience with AI/HPC workflows that use MPI and NCCL.
* Proficient in using Linux, including Rocky/CentOS/RHEL and/or Ubuntu Linux distributions. A solid understanding of container technologies such as Enroot and Docker.
* Proficiency in Python and Bash.
* Experience analyzing and tuning performance for a variety of EDA workloads.
* Excellent problem-solving skills to analyze complex systems, identify bottlenecks, and implement scalable solutions.
* Excellent communication and collaboration skills, with the ability to work effectively with various teams and individuals.
* Passion for continual learning and staying ahead of new technologies and effective approaches in the HPC infrastructure fields.

**Ways to stand out from the crowd:**

* Background with NVIDIA GPUs, CUDA Programming, NCCL and MLPerf benchmarking.
* Experience supporting EDA workloads and tools.
* Familiarity with High-Speed Networking pertaining to HPC, including InfiniBand, RDMA and RoCE.
* Understanding of fast, distributed storage systems such as Lustre and GPFS for AI/HPC workloads.
* Familiarity with metrics collection and visualization at scale with Prometheus, OpenSearch and Grafana.

Our technology has no boundaries! NVIDIA is building the most groundbreaking and powerful compute platforms for the world to use. It's because of our work that scientists, researchers and engineers can advance their ideas. At its core, our visual computing technology not only enables an amazing computing experience, but it is also energy efficient! We pioneered a supercharged form of computing loved by the most demanding computer users in the world: scientists, designers, artists, and gamers. It's not just technology though! It is our people, some of the brightest in the world, and our diverse company culture that make NVIDIA one of the most fun, innovative and dynamic places to work in the world! At the center of NVIDIA's culture are our core values, like innovation, excellence, determination and team, that guide us to be the best we can be.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.

You will also be eligible for equity.

Applications for this job will be accepted at least until March 15, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

