AI Cluster Architect
$165k - $185k
Vultr
Who We Are
Vultr is on a mission to make high-performance cloud infrastructure easy to use, affordable, and locally accessible for enterprises and AI innovators around the world. With 33 global cloud data center locations, Vultr is trusted by hundreds of thousands of active customers across 185 countries for its flexible, scalable, global Cloud Compute, Cloud GPU, Bare Metal, and Cloud Storage solutions. In December 2024 Vultr announced an equity financing at a $3.5 billion valuation. Founded by David Aninowsky and self-funded for over a decade, Vultr has grown to become the world's largest privately-held cloud infrastructure company.
Vultr Cares
- Excellent Medical Benefits w/ 100% company-paid premiums for employee only plan + 100% company-paid dental & vision premiums
- 401(k) plan that matches 100% up to 4% with immediate vesting
- Professional Development Reimbursement of $2,500 each year
- 11 Holidays + Paid Time Off Accrual + Rollover Plan + take your birthday off
- Commitment matters to Vultr! Increased PTO at the 3-year and 10-year anniversaries + 1-month paid sabbatical every 5 years + Anniversary Bonus each year
- $500 first year remote office setup + $400 each following year for new equipment
- Internet reimbursement up to $75 per month
- Gym membership reimbursement up to $50 per month
- Company-paid Wellable subscription
- Architect large-scale GPU clusters within fixed site power budgets that optimize for maximum GPU density while reserving necessary headroom for compute services, storage, and networking.
- Model and validate power consumption across the full cluster bill of materials (GPUs, CPUs, NICs, switches, fabric components, storage, and facility limits).
- Evaluate tradeoffs across multiple fabric networking architectures (InfiniBand, RoCE, Spectrum-X) as well as multi-plane, 2-tier/3-tier, and rail-optimized topologies.
- Determine network scale limits based on switch radix, link speed, topology, and blocking requirements.
- Gather, interpret, and maintain detailed SKU-level power and thermal specifications for GPUs, NICs, switches, DPUs, storage, and server platforms.
- Develop power-aware cluster configuration templates and capacity-planning models that can scale across sites with varying constraints and allow for quick iteration and ideation.
- Document architecture, design choices, tradeoff analyses, and operational considerations for deployment and lifecycle management.
- Provide guidance on future-proofing, including the ability to incorporate next-gen GPUs, NICs, or fabrics.
- Collaborate with vendors on novel fabric architectures that enable large-scale cluster deployments (100k+ GPUs).
- 7+ years designing or building large-scale HPC, AI, or hyperscale GPU clusters.
- Expert understanding of GPU and accelerator system design, including node topology, PCIe/NVLink/NVSwitch/ROCm, and NIC-to-GPU affinity considerations.
- Strong familiarity with InfiniBand, RoCE, and Spectrum-X networking, including multi-tier, multi-plane, Clos/dragonfly variants, and large-radix switch design.
- Demonstrated experience modeling power draw and thermal characteristics of servers, GPUs, NICs, switches, optics, and storage systems.
- Ability to design networks that maintain full non-blocking performance or intentionally introduce over/under-subscription while understanding impacts on workload performance.
- Proven ability to gather and analyze vendor SKU-level specifications and incorporate them into scalable cluster architectures.
- Experience balancing customer-driven requirements for compute, storage, and service density in combination with overall GPU count.
- Strong documentation, communication, and cross-functional collaboration skills.
Vacancy posted 4 days ago