Senior HPC and LSF Operations Engineer
$152k - $241.5k
NVIDIA
* Locations: US, CA, Santa Clara; US, MA, Westford; US, TX, Austin; US, NC, Durham
* Time type: Full time
* Posted on: Yesterday
* Job requisition ID: JR2014127

As a member of the Hardware Infrastructure EDA Compute team, you will optimize, scale, and support workload scheduling systems that directly impact design velocity and infrastructure efficiency. Success in this role requires both operational precision and the ability to develop and support forward-looking resource management solutions that address evolving compute demands. Beyond day-to-day operations, the role drives improvements in observability, service reliability, and automation, ensuring the EDA compute environment remains resilient, measurable, and aligned with long-term engineering demands.

**What you'll be doing:**

* Manage, scale, and optimize job scheduling systems (LSF, Slurm, etc.) in a large-scale, multi-site environment supporting EDA and other compute-intensive workloads
* Analyze scheduler and infrastructure performance data to identify systemic bottlenecks and drive measurable improvements in utilization, throughput, and turnaround time
* Lead problem solving across scheduler, OS, and workload layers, ensuring timely resolution of service-impacting issues
* Identify recurring operational challenges and implement targeted automation or process improvements to reduce manual effort and prevent repeat incidents
* Help define and track reliable metrics and SLOs for service performance and reliability, partnering with customers to ensure expectations are realistic and measurable
* Contribute to operational standards, documentation, and best practices to improve consistency across sites
* Partner directly with customer teams to clarify requirements, translate technical tradeoffs, and drive issues to closure

**What we need to see:**

* Bachelor’s degree in Computer Science or related field, or equivalent experience
* 5+ years of experience operating and supporting large-scale Linux-based compute infrastructure
* Strong hands-on experience supporting and tuning job scheduling systems (LSF, Slurm, etc.) in HPC or silicon design environments
* Proficiency in Linux systems administration (CentOS/RHEL)
* Strong problem-solving skills and the ability to independently analyze complex system behavior under load
* Clear and effective communication skills, including the ability to articulate technical tradeoffs and reliability metrics to engineering stakeholders

**Ways to stand out from the crowd:**

* Experience implementing reliability engineering practices within HPC scheduling environments
* Deep knowledge of configuration tuning, scheduler internals, and advanced troubleshooting techniques for job scheduling systems (LSF, Slurm, etc.)
* Experience building or enhancing observability systems, including metrics collection, monitoring pipelines, alerting strategies, and performance dashboards
* Background with container technologies such as Docker, Singularity, or Podman in HPC environments
* Experience influencing adoption of new infrastructure standards across multiple teams or sites

NVIDIA offers highly competitive salaries and a comprehensive benefits package. We have some of the most forward-thinking and hardworking people in the world on our team, and our collaborative talent continues to drive NVIDIA's growth. We are seeking creative and independent engineers with real passion for technology!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until March 15, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.


