Manager, HPC Storage Engineer
$150k - $240k
RunPod, Inc.
Overview

Runpod is pioneering the future of AI and machine learning, offering cutting-edge cloud infrastructure for full-stack AI applications. Founded in 2022, we are a rapidly growing, well-funded, remote-first company with a global team across the US, Canada, and Europe. Our mission is to create a foundational platform that enables developers and companies to build, deploy, and scale custom AI systems with speed and flexibility.

As AI workloads continue to push the limits of throughput, latency, and parallelism, Runpod is investing heavily in next-generation storage architectures purpose-built for GPU-centric compute. We are looking for an Engineering Manager, Datacenter Storage Engineering to lead the team responsible for Runpod's distributed storage infrastructure across all regions. This role owns the end-to-end storage stack, from NAND and NVMe devices through filesystems, transport protocols, and cluster-level deployment, ensuring performance, reliability, and scalability for AI workloads.

You will manage engineers designing and operating large-scale SAN and NFS-based systems, including high-performance shared filesystems for training workloads. This role requires deep technical fluency and architectural leadership, combined with strong people management and operational discipline.

Responsibilities

- Own Distributed Storage Architecture: Define, evolve, and operate Runpod's global storage platforms, supporting training, inference, checkpointing, and dataset access at scale.
- Build the Storage Engineering Team: Manage and grow a team of storage and systems engineers. Set clear ownership, technical direction, and operational standards across regions.
- High-Performance Shared Filesystems: Design and operate large-scale SAN and NFS deployments, including performance-sensitive shared storage for GPU clusters.
- Advanced Filesystems & Platforms: Lead deployments and operations of VAST Data, drawing on experience with Lustre or similar parallel filesystems used in HPC and AI environments.
- End-to-End Performance Ownership: Drive performance optimization from NAND and NVMe media through controllers, networking, and client access patterns.
- Next-Generation Storage Technologies: Evaluate and deploy cutting-edge capabilities such as NFS over RDMA, GPUDirect Storage (GDS), and low-latency data paths for accelerated workloads.
- Reliability & Scale: Establish best practices for replication, data tiering, data protection, failure recovery, capacity planning, and lifecycle management.
- Automation & Observability: Build automation for provisioning, expansion, upgrades, and monitoring. Ensure deep observability into throughput, latency, and error characteristics.
- Cross-Functional Collaboration: Partner with Datacenter Networking, GPU Platform, SRE, and Product teams to ensure storage systems meet evolving workload and customer needs.
- Vendor & Partner Management: Own technical relationships with storage vendors, hardware partners, and colocation providers; drive roadmap alignment and issue resolution.

Requirements

- Engineering Leadership Experience: 3+ years managing storage, systems, or infrastructure engineering teams in production environments.
- Distributed Storage Expertise: 8+ years designing and operating large-scale storage systems, including SAN and NFS architectures at multi-petabyte scale.
- VAST Data Experience: Hands-on experience deploying, operating, or deeply integrating VAST Data in production environments is required.
- Parallel Filesystems: Experience with Lustre or comparable HPC filesystems (e.g., GPFS, BeeGFS) supporting high-concurrency workloads.
- Low-Level Storage Knowledge: Deep understanding of NAND, NVMe, PCIe, storage controllers, and performance characteristics across the stack.
- High-Performance Data Paths: Proven experience with NFS over RDMA, RDMA-capable transports, or similar technologies. Familiarity with GPUDirect Storage strongly preferred.
- Linux Systems Expertise: Strong Linux internals knowledge, including filesystems, I/O scheduling, memory management, and tuning for performance workloads.
- Operational Excellence: Experience running 24/7 storage platforms with strong incident response, change management, and post-mortem discipline.
- Communication & Leadership: Ability to clearly communicate complex technical tradeoffs and lead teams through high-stakes infrastructure decisions.
- Successful completion of a background check.

Preferred Qualifications

- Experience supporting AI training pipelines, large-scale model checkpointing, and dataset streaming workloads.
- Familiarity with RDMA fabrics and close collaboration with datacenter networking teams.
- Experience designing storage systems for multi-tenant isolation and secure data access.
- Background in hyperscale, HPC, or AI-focused infrastructure environments.
- Experience building internal storage platforms or abstractions consumed by product teams.

What You'll Receive

- The competitive base pay for this position ranges from $150,000 to $240,000 USD. This salary range may be inclusive of several career levels at Runpod and will be narrowed during the interview process based on a number of factors, including the candidate's experience, qualifications, and location.
- Meaningful equity in a fast-growing company: everyone on the team receives stock options. Your impact drives our growth, and you share in the upside.
- Generous medical, dental & vision plans: we cover 100% for all employees and partial for dependents.
- Flexible PTO: take the time you need to recharge.
- Most roles are remote-first, with inclusive, collaborative teams that use Slack as the main form of internal communication.
Join a passionate team on the cutting edge of AI infrastructure, where culture, learning, and ownership are at the heart of how we scale.

