
HPC Data Storage Administrator (Research & HPC Data Platforms)

$150.29k - $171.67k

Stanford

Storage Expert Opportunity

Stanford's Research Computing team is seeking a storage expert to join our Research & HPC Data Platforms group. This is a flexible-level posting; we are seeking either a Storage Architect or a Senior Storage Systems Administrator to maintain and expand our current world-class infrastructure. You will work directly with our research and HPC data platforms team lead to manage a diverse environment of more than 100PB and 5 billion files, including high-speed Lustre, MinIO object storage, and Lustre HSM, among other platforms.

Why Stanford? You aren't just managing a storage cluster; you are a part of a data and storage ecosystem that supports Nobel-caliber research across all disciplines.

Storage Architect Key Responsibilities
  • Architecture: Adapt and evolve the technology designs of existing systems to meet the needs of future computing platforms and research aims.
  • Platform Management: Deliver on the scaling, reliability, security, compliance, operations, and lifecycle management of our primary research computing storage platforms, including for high-risk data.
  • Tiered Storage Architecture: Oversee the integration of Lustre HSM on the Elm platform, managing data movement policies between parallel filesystems and MinIO object storage.
  • Performance Engineering: Tune I/O for large-scale High Performance Computing and AI workloads.
  • Community Stewardship: Represent Stanford within the Lustre community and other key community groups, contributing to the upstream roadmap and maintaining a vendor-neutral storage strategy.
Required Qualifications
  • Education: Bachelor's degree and eight years of relevant experience, or a combination of education and relevant experience.
  • Expertise at Scale: 8+ years of hands-on experience architecting, building, and managing Lustre and ZFS or similar filesystems at the 20PB+ scale.
  • Object Storage & HSM: Deep technical fluency in MinIO and Lustre HSM (copytools, policy engines like RobinHood) or similar tools.
  • Kernel & Network Mastery: Expert-level knowledge of the Linux kernel and large-scale InfiniBand/Ethernet fabric tuning.
  • In-depth Troubleshooting Experience: Must be capable of leading the debugging of issues such as kernel panics, LNet congestion, and metadata bottlenecks.
  • Leadership: Proven experience mentoring junior admins and leading large-scale migration projects without data loss.
  • Communication: Strong written and verbal communication skills.
Senior Storage Systems Administrator Key Responsibilities
  • Platform Management: Contribute to the scaling, reliability, security, compliance, operations, and lifecycle management of our primary research computing storage platforms, including for high-risk data.
  • Operational Excellence: Perform complex filesystem upgrades, kernel patches, and hardware refreshes with minimal downtime.
  • Monitoring & Telemetry: In collaboration with others, build and maintain sophisticated observability stacks for real-time I/O tracking and trend analysis.
  • User Support: Act as an escalation point for researchers struggling with complex I/O patterns, job failures, or data access issues.
  • Maintenance: Manage the physical and logical health of the storage fleet, including RMA processes, firmware updates, and disk replacement cycles.
Required Qualifications
  • Experience: 5+ years of Linux Systems Administration, with 3+ years specifically in an HPC or large-scale data environment.
  • Technical Stack: Strong hands-on experience with Lustre, ZFS, MinIO, and/or similar technologies.
  • Scripting: Advanced proficiency in scripting languages for automating routine storage tasks and parsing system logs.
  • Hardware Mastery: Comfortable with the physical aspects of the role—diagnosing hardware failures and understanding power/cooling requirements for high-density storage.
  • Communication: Strong written and verbal communication skills.
Physical Requirements*
  • Constantly perform desk-based computer tasks.
  • Frequently sit, grasp lightly/fine manipulation.
  • Occasionally stand/walk, write by hand.
  • Rarely use a telephone, lift/carry/push/pull objects that weigh up to 10 pounds.

* Consistent with its obligations under the law, the University will provide reasonable accommodations to applicants and employees with disabilities. Applicants requiring a reasonable accommodation for any part of the application or hiring process should contact Stanford University Human Resources.

Working Conditions:
  • May work extended hours, evenings, and weekends.
Work Standards:
  • Interpersonal Skills: Demonstrates the ability to work well with Stanford colleagues and clients and with external organizations.
  • Promote Culture of Safety: Demonstrates commitment to personal responsibility and value for safety; communicates safety concerns; uses and promotes safe behaviors based on training and lessons learned.
  • Subject to and expected to comply with all applicable University policies and procedures, including but not limited to the personnel policies and other policies found in Stanford's Administrative Guide.

The expected pay range for this position is $150,289 to $171,674 per annum. Stanford University provides pay ranges representing its good faith estimate of the salary or hourly wage the university reasonably expects to pay for a position upon hire. The pay offered to a selected candidate will be determined based on factors such as (but not limited to) the scope and responsibilities of the position, the qualifications of the selected candidate, departmental budget availability, internal equity, geographic location and external market pay for comparable jobs. At Stanford University, base pay represents only one aspect of the comprehensive rewards package. The Cardinal at Work website provides detailed information on Stanford's extensive range of benefits and rewards offered to employees. Specifics about the rewards package for this position may be discussed during the hiring process.

Stanford is an equal employment opportunity and affirmative action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected by law.

Vacancy posted 3 days ago
