Director of AI Infrastructure

$176.4k - $264.6k

The Allen Institute for Artificial Intelligence

Persons in these roles are expected to work from our offices in Seattle. On-site requirements vary based on position and team. If you have questions about on-site work arrangements for this role, please ask your recruiter.
Our base salary range is $176,400 - $264,600, and in addition we have generous bonus plans to provide a competitive compensation package.

Who We Are:

Ai2 is a non-profit research institute at the forefront of open-source AI development. Unlike industry peers, our goal is to share our findings, data, code, and models with the global scientific community.

We are seeking a Director of AI Infrastructure to oversee the systems that power our research. This leader will be responsible for the full lifecycle of our high-performance computing (HPC) environment, which includes on-prem GPU clusters and the software orchestration layer that schedules workloads across a hybrid cloud environment.

Who You Are:

  • Systems Expert: You have a deep understanding of the Linux kernel, container runtimes, and distributed systems. You understand the performance implications of InfiniBand topologies and NCCL optimizations.
  • Strategic Thinker: You look beyond the immediate "fire" to design systems that will scale for the next 3-5 years of AI research.
  • Pragmatic Leader: You are comfortable making trade-offs between technical elegance and operational necessity. You prioritize reliability and researcher velocity above all else.

Your Next Challenge:

The essential functions include, but are not limited to, the following:

  • Cluster Management: Oversee the availability and performance of dense on-prem GPU clusters. You will partner with hardware vendors and internal teams to ensure our physical infrastructure meets the demands of frontier model training.
  • Orchestration & Scheduling: Direct the strategy for Beaker, our internal orchestration platform. Your goal is to optimize job scheduling, ensuring high utilization of both on-prem assets and elastic cloud resources (AWS/GCP).
  • Storage Architecture: Develop and execute a long-term roadmap for storage that balances high-throughput performance for active training with cost-effective durability for petascale research data.
  • Resource Economics: Act as the primary steward of our GPU compute budget. You will make data-driven decisions on when to burst to the cloud versus when to invest in on-prem capacity.
  • User Support & Velocity: Serve as the technical bridge to our research teams. You will ensure that infrastructure is an accelerator, not a bottleneck, for a diverse set of research objectives.

What You'll Need:

  • Experience: 12+ years in infrastructure, systems engineering, or HPC, with at least 5 years in a leadership role managing multi-disciplinary engineering teams.
  • Bachelor's degree in a related field; a relevant advanced degree may substitute for equivalent years of technical work experience.
  • GPU/HPC Stack: Direct experience managing large-scale NVIDIA GPU clusters and high-performance networking (InfiniBand/RoCE).
  • Cloud Native: Strong background in Kubernetes, Slurm, or similar orchestration frameworks, particularly in hybrid-cloud configurations.
  • Storage Mastery: Experience with distributed filesystems (e.g., WEKA, Ceph, Lustre) and cloud storage integration at scale.
  • Software Development: Proficient in Go or Python, with the ability to review architecture and code for our internal tooling.

Physical Demands and Work Environment:

The physical demands described here are representative of those that must be met by a team member to successfully perform the essential functions of this position. Reasonable accommodations may be made to enable individuals with disabilities to perform the functions.

  • Must be able to remain in a stationary position for long periods of time.
  • The ability to communicate information and ideas so others will understand. Must be able to exchange accurate information in these situations.
  • The ability to observe details at close range.
  • Can work under deadlines.

A Little More About Ai2:

Ai2 is a Seattle-based non-profit AI research institute founded in 2014 by the late Paul Allen. Our mission is building breakthrough AI to solve the world's biggest problems. We develop foundational AI research and innovation to deliver real-world impact through large-scale open models, data, robotics, conservation, and beyond.

In addition to Ai2's core mission, we also aim to contribute to humanity through our treatment of each member of the Ai2 Team. Some highlights are:

  • We are a learning organization - because everything Ai2 does is ground-breaking, we are learning every day. Similarly, through weekly Ai2 Academy lectures, a wide variety of world-class AI experts as guest speakers, and our commitment to your personal on-going education, Ai2 is a place where you will have opportunities to continue learning alongside your coworkers.
  • We value diversity - We seek to hire, support, and promote people from all genders, ethnicities, and all levels of experience regardless of age. We particularly encourage applications from women, non-binary individuals, people of color, members of the LGBTQA+ community, and people with disabilities of any kind.
  • We value inclusion - We understand the value that people's individual experiences and perspectives can bring to an organization, and we are building a culture in which all voices are heard, respected and considered.
  • We emphasize a healthy work/life balance - we believe our team members are happiest and most productive when their work/life balance is optimized. While we value powerful research results which drive our mission forward, we also value dinner with family, weekend time, and vacation time. We offer generous paid vacation and sick leave as well as family leave.
  • We are collaborative and transparent - we consider ourselves a team, all moving with a common purpose. We are quick to cheer our successes, and even quicker to share and jointly problem solve our failures.
  • We are in Seattle - and our office is on the water! We have mountains, we have lakes, we have four seasons, we bike to work, we have a vibrant theater scene, and we have so much else. We even have kayaks for you to paddle right outside our front door. We welcome interest from applicants from outside of the United States.
  • We are friendly - chances are you will like every one of the 200+ (and growing) people who work here. We do.

Ai2 is proud to be an Equal Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. You may view the related Know Your Rights compliance poster and the Pay Transparency Nondiscrimination Provision by clicking on their corresponding links.

This employer participates in E-Verify and will provide the federal government with your Form I-9 information to confirm that you are authorized to work in the U.S. If E-Verify cannot confirm that you are authorized to work, this employer is required to give you written instructions and an opportunity to contact the Department of Homeland Security (DHS) or Social Security Administration (SSA) so you can begin to resolve the issue before the employer can take any action against you, including terminating your employment. Employers can only use E-Verify once you have accepted a job offer and completed the Form I-9.

We are committed to providing reasonable accommodations to employees and applicants with disabilities to the full extent required by the Americans with Disabilities Act (ADA). If you feel you need a reasonable accommodation pursuant to the ADA, you are encouraged to contact us.

Benefits:

  • Team members and their families are covered by medical, dental, vision, and an employee assistance program.
  • Team members are able to enroll in our health savings account plan, our healthcare reimbursement arrangement plan, and our health care and dependent care flexible spending account plans.
  • Team members are able to enroll in our company's 401k plan.
  • Team members will receive $125 per month to assist with commuting or internet expenses and will also receive $200 per month for fitness and wellbeing expenses.
  • Team members will also receive up to ten sick days per year, up to seven personal days per year, up to 20 vacation days per year and twelve paid holidays throughout the calendar year.
  • Team members will be able to receive annual bonuses and can participate in the long-term incentive plan.

Note: This job description in no way states or implies that these are the only duties to be performed by the team member(s) in this position. Team members will be required to follow any other job-related instructions and to perform any other job-related duties requested by any person authorized to give instructions or assignments. All duties and responsibilities are essential functions and requirements and are subject to possible modification to reasonably accommodate individuals with disabilities. To perform this job successfully, the team member(s) will possess the skills, aptitudes, and abilities to perform each duty proficiently. Some requirements may exclude individuals who pose a direct threat or significant risk to the health or safety of themselves or others. The requirements listed in this document are the minimum levels of knowledge, skills, or abilities. This document does not create an employment contract, implied or otherwise, other than an at-will relationship.

Vacancy posted 3 days ago