Head of AI Infrastructure Engineering
$300k
AI Foundry
Job Description
Compensation: $300K - $450K+
Travel Requirement: International travel required to India 8+ times per year
AI Foundry is supporting one of the largest AI infrastructure deployments in India—gigawatt-scale, thousands of GPUs, designed for training and inference workloads that will serve enterprises across the region. This is greenfield work: you'll be making decisions from the US that determine how the entire stack gets built overseas.
We're looking for someone who has built GPU infrastructure at serious scale and wants to do it again with full end-to-end control. You understand the hardware, the networking, the cooling, the operations—and you know how to make decisions that optimize for performance, cost, and reliability simultaneously. This is a long-lead role where getting the foundation right matters more than moving fast and breaking things.
If you've built infrastructure at a hyperscaler or AI-native provider and wanted more control over the full stack, this is that opportunity.
What You'll Do
- Design GPU cluster architectures for training and inference at scale (thousands of GPUs, not dozens)
- Specify hardware configurations: GPU servers, networking fabric, storage systems, power and cooling
- Evaluate and select vendors; negotiate technical specifications with OEMs like Dell, Supermicro, HPE, and NVIDIA directly
- Work with facility teams on power infrastructure, electrical distribution, and cooling solutions for high-density AI deployments
- Build automation for cluster provisioning, configuration management, and lifecycle operations
- Implement job scheduling and workload management (Slurm, Kubernetes, custom orchestration as needed)
- Establish monitoring, alerting, and observability for infrastructure health at scale
- Lead calls with overseas teams to review progress, present architectures, and provide technical guidance
- Define operational runbooks, incident response, and SRE practices
- Build and lead a team of infrastructure engineers, systems administrators, and hardware specialists
- Travel to India 8+ times per year to work directly with client teams
- You've built GPU infrastructure at scale; you know NVIDIA's ecosystem (DGX, HGX, NVLink, NVSwitch, CUDA, NCCL) from hands-on experience, not just vendor briefings
- Deep expertise in high-performance networking: InfiniBand, 400G Ethernet, RDMA, GPUDirect; you understand why network topology matters for distributed training
- Strong Linux systems engineering background; you've managed thousands of nodes and know what breaks at scale
- Experience with storage systems for ML workloads: Lustre, GPFS, BeeGFS, NVMe-oF, parallel file systems
- You've worked at a hyperscaler (AWS, GCP, Azure) or AI-native infrastructure provider (CoreWeave, Lambda, Crusoe, or similar); you know what good looks like
- Comfortable with data center operations: power, cooling, rack density, PUE optimization; you can have a real conversation with facilities engineers
- You can make decisions with incomplete information and defend them technically; you don't wait for perfect specs before moving forward
- Able to hold a high bar and push teams toward excellence without being a know-it-all
- Strong communicator who can translate between hardware vendors, operations teams, and business stakeholders across time zones
- Hungry to build something from the ground up; you're not looking for a role where you inherit someone else's architecture
- Comfortable with ambiguity and able to take confident action when details are missing
- Experience with advanced cooling: liquid cooling, two-phase cooling, immersion systems
- Background in greenfield data center buildouts, not just operating existing infrastructure
- Familiarity with India-specific considerations: power procurement, regulatory requirements, vendor landscape
- Prior work with AI/ML frameworks and MLOps; you understand what the workloads actually look like
- Competitive compensation
- Medical and Dental benefits
- 401K
- Opportunity to shape AI strategy for 500M+ users

