Senior Platform and EngOps Engineer - Cluster Operations
$176k - $276k
Dormont Manufacturing Co
NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for phenomenal people like you to help us accelerate the next wave of artificial intelligence.

Join our team of innovative engineers who develop and maintain software facilitating GPU communication, driving groundbreaking solutions in High-Performance Computing and Deep Learning. We’re looking for highly motivated EngOps and Platform Engineers to boost execution efficiency while managing and maintaining large GPU clusters interconnected via NVLink and InfiniBand.

What you will be doing

- Develop automated tools to efficiently deploy, provision, and maintain extensive GPU clusters interconnected via NVLink and InfiniBand.
- Implement modern DevOps tools to automate software updates, perform maintenance tasks, and monitor cluster availability, ensuring seamless operations.
- Take ownership of daily cluster failures and issues, troubleshooting them promptly to maintain optimal cluster availability and performance.
- Manage the rollout and rollback of cluster software and firmware updates, ensuring smooth transitions and minimal disruptions.
- Collaborate effectively with dynamic Engineering and Product Teams across multiple time zones to align cluster operations with evolving project requirements.

What we need to see

- BS or MS in Computer Science, Computer Engineering, Electrical Engineering, or a related field, or equivalent experience.
- 8+ years of hands-on experience in deploying and administering clusters, servers, switches, and related infrastructure.
- Automation expert with hands-on skills in Ansible, Python, and shell scripting.
- Deep understanding of operating systems, computer networks, and high-performance applications.
- Proven ability to work effectively with developers and test engineers across different teams and time zones.
- Proficient with Linux fundamentals.

Ways to stand out from the crowd

- Familiarity with resource scheduling managers, preferably Slurm.
- Direct experience with industry-standard alerting tools and emergency response practices.
- Hands-on experience with GPU-focused hardware and software, such as DGX systems and Compute Clusters.
- Proficiency in crafting and implementing a robust metrics collection and alerting infrastructure.
- Proficiency in designing large-scale networking technologies and addressing the associated challenges.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 176,000 USD - 276,000 USD for Level 4, and 208,000 USD - 333,500 USD for Level 5. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until March 20, 2026. This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

