Senior HPC and LSF Operations Engineer
NVIDIA
Hardware Infrastructure EDA Compute Team Member
As a member of the Hardware Infrastructure EDA Compute team, you will optimize, scale, and support workload scheduling systems that directly impact design velocity and infrastructure efficiency. Success in this role requires both operational precision and the ability to develop and support forward-looking resource management solutions that address evolving compute demands. Beyond day-to-day operations, the role drives improvements in observability, service reliability, and automation, ensuring the EDA compute environment remains resilient, measurable, and aligned with long-term engineering demands.
What you'll be doing:
- Manage, scale, and optimize job scheduling systems (LSF, Slurm, etc.) in a large-scale, multi-site environment supporting EDA and other compute-intensive workloads
- Analyze scheduler and infrastructure performance data to identify systemic bottlenecks and drive measurable improvements in utilization, throughput, and turnaround time
- Lead problem solving across scheduler, OS, and workload layers, ensuring timely resolution of service-impacting issues
- Identify recurring operational challenges and implement targeted automation or process improvements to reduce manual effort and prevent repeat incidents
- Help define and track reliable metrics and SLOs for service performance and reliability, partnering with customers to ensure expectations are realistic and measurable
- Contribute to operational standards, documentation, and best practices to improve consistency across sites
- Partner directly with customer teams to clarify requirements, translate technical tradeoffs, and drive issues to closure
What we need to see:
- Bachelor's degree in Computer Science or related field, or equivalent experience
- 5+ years of experience operating and supporting large-scale Linux-based compute infrastructure
- Strong hands-on experience supporting and tuning job scheduling systems (LSF, Slurm, etc.) in HPC or silicon design environments
- Proficiency in Linux systems administration (CentOS/RHEL)
- Strong problem-solving skills and the ability to independently analyze complex system behavior under load
- Clear and effective communication skills, including the ability to articulate technical tradeoffs and reliability metrics to engineering stakeholders
Ways to stand out from the crowd:
- Experience implementing reliability engineering practices within HPC scheduling environments
- Deep knowledge of job scheduling systems (LSF, Slurm, etc.), including configuration tuning, scheduler internals, and advanced troubleshooting techniques
- Experience building or enhancing observability systems, including metrics collection, monitoring pipelines, alerting strategies, and performance dashboards
- Background with container technologies such as Docker, Singularity, or Podman in HPC environments
- Experience influencing adoption of new infrastructure standards across multiple teams or sites
NVIDIA offers highly competitive salaries and a comprehensive benefits package. We have some of the most forward-thinking and hardworking people in the world on our team, and our collaborative talent continues to drive NVIDIA's growth. We are seeking creative and independent engineers with a real passion for technology!

