
Senior HPC and LSF Operations Engineer

$152k - $241.5k

NVIDIA

Locations: US, CA, Santa Clara; US, MA, Westford; US, TX, Austin; US, NC, Durham
Time type: Full time
Posted on: Yesterday
Job requisition id: JR2014127

As a member of the Hardware Infrastructure EDA Compute team, you will optimize, scale, and support workload scheduling systems that directly impact design velocity and infrastructure efficiency. Success in this role requires both operational precision and the ability to develop and support forward-looking resource management solutions that address evolving compute demands. Beyond day-to-day operations, the role drives improvements in observability, service reliability, and automation, ensuring the EDA compute environment remains resilient, measurable, and aligned with long-term engineering demands.

**What you'll be doing:**

* Manage, scale, and optimize job scheduling systems (LSF, Slurm, etc.) in a large-scale, multi-site environment supporting EDA and other compute-intensive workloads
* Analyze scheduler and infrastructure performance data to identify systemic bottlenecks and drive measurable improvements in utilization, throughput, and turnaround time
* Lead problem solving across scheduler, OS, and workload layers, ensuring timely resolution of service-impacting issues
* Identify recurring operational challenges and implement targeted automation or process improvements to reduce manual effort and prevent repeat incidents
* Help define and track reliable metrics and SLOs for service performance and reliability, partnering with customers to ensure expectations are realistic and measurable
* Contribute to operational standards, documentation, and best practices to improve consistency across sites
* Partner directly with customer teams to clarify requirements, translate technical tradeoffs, and drive issues to closure

**What we need to see:**

* Bachelor’s degree in Computer Science or related field, or equivalent experience
* 5+ years of experience operating and supporting large-scale Linux-based compute infrastructure
* Strong hands-on experience supporting and tuning job scheduling systems (LSF, Slurm, etc.) in HPC or silicon design environments
* Proficiency in Linux systems administration (CentOS/RHEL)
* Strong problem solving skills and the ability to independently analyze complex system behavior under load
* Clear and effective communication skills, including the ability to articulate technical tradeoffs and reliability metrics to engineering stakeholders

**Ways to stand out from the crowd:**

* Experience implementing reliability engineering practices within HPC scheduling environments
* Deep knowledge of job scheduling systems (LSF, Slurm, etc.), including configuration tuning, scheduler internals, and advanced troubleshooting techniques
* Experience building or enhancing observability systems, including metrics collection, monitoring pipelines, alerting strategies, and performance dashboards
* Background with container technologies such as Docker, Singularity, or Podman in HPC environments
* Experience influencing adoption of new infrastructure standards across multiple teams or sites

NVIDIA offers highly competitive salaries and a comprehensive benefits package. We have some of the most forward-thinking and hardworking people in the world on our team and our collaborative talent continues to drive NVIDIA's growth. We are seeking creative and independent engineers with real passion for technology!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4. You will also be eligible for equity and .

Applications for this job will be accepted at least until March 15, 2026. This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Senior HPC and LSF Operations Engineer in Austin, TX vacancy
  • A leading technology firm in Austin is seeking a Senior HPC and LSF Operations Engineer to manage and optimize large-scale job scheduling systems. This role demands a Bachelor's degree in a relevant field and over 5 years of experience in Linux-based environments. Ideal... 
    Senior

    NVIDIA Corporation

    Austin, TX
    4 days ago
  • A leading energy company located in Austin, TX, is seeking a Senior Operations Engineer to lead operations for synchronous condenser assets. This role involves providing technical support, responding to operational issues, and coordinating outage requests. Ideal candidates... 
    Senior

    Wind Energy Transmission Texas, LLC.

    Austin, TX
    20 hours ago
  • TOTAL Deutschland GmbH is seeking an Operations Engineer to ensure optimal operation of solar and battery storage assets in Austin, Texas. This role requires a Bachelor's degree in Engineering and 6+ years of relevant experience in solar PV or battery energy storage systems... 
    Senior

    TOTAL Deutschland GmbH

    Austin, TX
    2 days ago
  • $152k - $241.5k

    Senior HPC Cluster Engineer. Locations: US, CA, Santa Clara:...  ...Cluster Engineer to design, deploy, and operate GPU Compute Clusters for EDA (...  ...schedulers and orchestrators, such as Slurm, LSF, PBS or K8s. Applied experience with AI... 
    Senior

    NVIDIA

    Austin, TX
    4 days ago
  • Applied Materials, Inc. in Austin, TX is seeking an Operations and Customer Quality Engineer III with expertise in quality solving tools and statistical software. The successful candidate will develop and apply quality standards while designing inspection methods and documenting... 
    Senior

    Applied Materials, Inc.

    Austin, TX
    1 day ago
  •  ...collaboration that fosters trust and unlocks creativity. About the role Engineer. Detective. Communicator. You are the whole package and you’re...  ...help supporting our first-of-its-kind, SaaS-based revenue operations solution. What you'll do Performs second-level application... 
    Senior
    Work at office
    Local area
    Remote work
    Flexible hours
    2 days per week
    3 days per week

    Rippling

    Austin, TX
    4 days ago
  • Gotransverse in Austin, TX is seeking a Technical Support Engineer to provide application support for their SaaS-based revenue operations solution. This role emphasizes strong technical skills in Linux and Java, alongside automation and system security responsibilities.... 
    Senior

    Rippling

    Austin, TX
    4 days ago
  • Advanced Micro Devices is seeking a Senior Technical Program Manager in Austin to lead the execution of AI and HPC programs. This role involves translating customer requirements into action plans, ensuring delivery timelines, and collaborating effectively with cross-functional... 
    Senior
    Remote job

    Advanced Micro Devices

    Austin, TX
    4 days ago
  • $152k - $241.5k

    A leading technology company is seeking a Senior Site Reliability Engineer in Austin, Texas. This role involves owning SRE solutions, supporting large-scale HPC clusters, and utilizing CI/CD techniques. Ideal candidates should have a B.S. in Computer Science, over 5 years... 
    Senior

    NVIDIA Corporation

    Austin, TX
    1 day ago
  • Senior Operations Engineer, Retail Store Operations & Support - Retail & Marcom Engineering Austin, Texas, United States Corporate Functions At Apple, we don’t just build products — we revolutionize entire industries. Our innovation is driven by the diverse ideas and... 
    Senior
    Work experience placement
    Worldwide
    Weekend work

    Apple Inc.

    Austin, TX
    2 days ago
  • $152k - $241.5k

    Senior Site Reliability Engineer - HPC. Locations: US, CA...  ..., from design and implementation to operation and continuous improvement, ensuring...  ...large‑scale HPC clusters using Slurm, LSF or Kubernetes clusters, including... 
    Senior

    NVIDIA Corporation

    Austin, TX
    4 days ago
  •  ...of diverse teams and take your career wherever you want it to go. Join EY and help to build a better working world. WAF Operations Solution Engineer Practice Description As a WAF Operations Solution Engineer, you will be responsible for implementing and managing Web... 
    Senior
    Flexible hours

    Ernst & Young Oman

    Austin, TX
    4 days ago
  •  ...company, is a leading developer, owner, and operator of utility‑scale energy storage and...  ...Responsibilities Lead the Operations Engineering team supporting utility‑scale BESS and solar...  ..., and safety; Collaborate with senior leadership on long‑term operational planning... 
    Senior
    Remote work

    Aypa Power

    Austin, TX
    3 days ago
  •  ...Texas to join its OCI team. The role involves designing and developing image automation software and cloud services, with a focus on GPU/HPC infrastructure solutions. Candidates should have a robust Linux OS background and experience in enterprise distributed systems.... 
    Senior

    Oracle

    Austin, TX
    2 days ago
  • $135.96k - $203.94k

    Cacheflow is seeking a Senior Backend Engineer to design, build, and scale systems for AI-driven operations. This position involves owning the entire product lifecycle, from stakeholder discovery to deployment, while collaborating with diverse teams to develop secure and... 
    Senior

    Cacheflow

    Austin, TX
    20 hours ago
  • Digital Turbine Media, Inc. is seeking a Principal Engineer of Security Operations to lead technical advancements in their Security Operations Center (SOC). This full-time hybrid role focuses on cloud security, incident response, and collaboration across teams to maintain... 
    Senior
    Full time

    Digital Turbine Media, Inc.

    Austin, TX
    20 hours ago
  • $165k - $210k

     ...solve complex challenges, and stay at the forefront of data engineering and AI advancements. Remote first with casual, award-winning...  ...background with most of your experience in infrastructure and operations (managing enterprise data platforms). Responsibilities Leading... 
    Senior
    Casual work
    Remote work

    Jobot

    Austin, TX
    3 days ago
  •  ...hold each other to a high bar — and we’re looking for a Senior Test Automation Engineer who thinks in data, builds in systems, and treats quality...  ...: a stream replay and orchestration engine . Our product operates on real-time audio and video streams — the kind that flow... 
    Senior
    Remote work

    GetReal Security LLC

    Austin, TX
    3 days ago
  • A leading technology company in Austin, Texas seeks a SoC Power Analysis and Optimization Engineer to drive automation for SOC power optimization, collaborate with cross-functional teams, and explore machine learning methodologies. The ideal candidate has a bachelor's... 
    Senior

    Apple Inc.

    Austin, TX
    20 hours ago
  • GetReal Security LLC in Austin, TX is seeking a Senior Test Automation Engineer to design and implement a robust stream replay and orchestration engine. This role emphasizes building scalable testing frameworks for real-time audio and video streams, validating product performance... 
    Senior

    GetReal Security LLC

    Austin, TX
    2 days ago
  • Apex Fintech Solutions UK Ltd. is seeking a Senior IAM Automation Engineer to transform their management of workforce identity. This role intertwines DevOps practices with IAM expertise, aiming to build self-service, API-driven solutions within a multi-cloud environment... 
    Senior

    Apex Fintech Solutions UK Ltd.

    Austin, TX
    2 days ago
  • $106.8k - $194.8k

     ...diverse teams and take your career wherever you want it to go. Join EY and help to build a better working world. WAF Operations Solution Engineer PRACTICE DESCRIPTION: As a WAF Operations Solution Engineer, you will be responsible for implementing and managing Web... 
    Senior
    Full time
    Summer holiday
    Flexible hours

    EY

    Austin, TX
    20 hours ago
  • Jaide Health in Austin, Texas, seeks a Senior Security Engineer to enhance application security through automation and AI. You'll lead the vulnerability management program, focusing on securing development practices without slowing down the team. The ideal candidate has... 
    Senior

    Jaide Health

    Austin, TX
    4 days ago
  • Our client, a leader in financial services supporting innovative trading and investment solutions, is seeking a Test Automation Engineer - Senior to join their team. As a Test Automation Engineer - Senior, you will be part of the Quality Assurance department supporting... 
    Senior
    Weekly pay
    Temporary work
    Flexible hours

    ManpowerGroup Global, Inc.

    Austin, TX
    20 hours ago
  • A government agency in Austin, Texas, is looking for a Test Engineer to design and validate automated test systems used in manufacturing. Responsibilities include designing circuit boards, developing test code, and analyzing product health trends. Ideal candidates will... 
    Senior

    City of Shakopee, MN

    Austin, TX
    4 days ago
  • Plasticos Castella SA seeks a Sr. Manufacturing Test Automation Engineer in Austin, TX. The role focuses on test solution development for optical systems in a high-performance environment, requiring extensive hands-on experience and engineering expertise. Qualified candidates... 
    Senior
    Remote job

    Plasticos Castella SA

    Austin, TX
    20 hours ago
  •  ...Job Description Salary: POSITION OVERVIEW The Senior Operations Engineer (Engineer) is responsible for providing technical support and analysis to system operations and field operations staff. In this role, the engineer will be responsible for... 
    Senior
    Work experience placement
    Work at office
    Shift work

    Wind Energy Transmission Texas

    Austin, TX
    29 days ago
  •  ...Description Job Description We are seeking an experienced Senior Mainframe Automation Migration Engineer to lead migration and optimization of mainframe...  ...Point to IBM Systems Automation for Integrated Operations Management (SAIOM) . Key Responsibilities Lead... 
    Senior

    Phizenix

    Austin, TX
    9 days ago
  • A leading technology company in Austin, Texas is seeking a Senior Software Engineer for Enterprise Technology Services. This role involves developing intelligent automation solutions, managing global teams, and influencing decision-making through data-driven insights.... 
    Senior

    Apple Inc.

    Austin, TX
    2 days ago
  • Ernst & Young Oman is looking for a talented Adobe Workfront Fusion Specialist to join their Marketing Transformation team. You will leverage your expertise in Adobe Workfront Fusion to design and implement solutions that optimize marketing processes. Candidates should ...
    Senior

    Ernst & Young Oman

    Austin, TX
    20 hours ago
