Principal AI Performance Architect for Scalable GPU Training
Advanced Micro Devices, Inc.
Advanced Micro Devices is looking for a Principal Engineer in Santa Clara, CA to lead AI infrastructure development, define GPU architecture specifications, and drive performance gains in ML systems. The role involves leading innovative techniques, collaborating with stakeholders, and establishing best practices for distributed ML systems. The ideal candidate has extensive experience in GPU architectures, CUDA programming, and optimizing large-scale ML systems. A Bachelor's, MS, or PhD in Computer Science or Engineering is required.
$272k - $431.25k
...artificial intelligence (AI), agentic workloads, deep learning (DL), high-performance computing (HPC), cloud... .... Knowledge of GPU and SOC design. Experience... ...faster AI model training, agentic use-cases, efficient... ...data processing, and scalable cloud deployments....
Principal, Training, Performance
- ...The Role As a Principal Engineer, you will spearhead... ...next generation of AI infrastructure by defining GPU architecture... ...enable massive model training at scale. Your... ...expertise will drive 2-3x performance gains in both... ...parallel dimensions Architect memory‐efficient...
Principal, Training, Performance, Remote work
- ...experiences—from AI and data centers,... ...seeking a Robotics AI Architect to define and... ...production‑grade performance targets. THE PERSON... ...enable broad ecosystem scalability. KEY... ...design across CPU, GPU, and accelerators... ...subsystems Cloud (training, simulation, fleet...
Principal, Training, Performance
$272k - $431.25k
...We are seeking a Principal AI and ML Infra Software Engineer, GPU Clusters at NVIDIA to join our... ...potent, effective, and scalable solutions as we mold... ...Monitor and optimize the performance of our infrastructure ensuring... ...distributed training operations using PyTorch...
Principal, Training, Performance
$272k - $431.25k
...group is solving some of AI’s hardest... ...computing interconnects. This Principal Architect role leads the research... ...communication systems—GPU-to-GPU, GPU-to-storage... ...deep expertise in high-performance networking (InfiniBand... ...parallelism, or distributed training and inference patterns...
Principal, Training, Performance
- ...This position is for a Senior Principal Engineer, AI/ML System Architect. As system architect, one... ...systems design including AI training and inference workloads and performance demands, as well as compute,... ...AMD, Intel, or other modern GPU accelerators and support systems...
Principal, Training, Performance, Local area, Remote work
$206.4k - $379.1k
...impressive content. The AI Foundations team constructs a flexible, scalable AI framework that... ...We're looking for a Principal Architect to build and implement... ...infrastructure to support model training, fine‐tuning,... ...frameworks. Develop high‐performance runtime services for...
Principal, Training, Performance, Local area, Worldwide, Flexible hours
- ...This position is for a Senior Principal Engineer, AI/ML System Architect. As system architect, one will... ...systems design including AI training and inference workloads and performance demands, as well as compute,... ...AMD, Intel, or other modern GPU accelerators and support systems...
Principal, Training, Performance, Local area, Remote work
$254k - $349.25k
...protect how people, data, and AI agents connect across email... ...Overview We are seeking a Principal ML Architect to lead the design and... ...expertise in model architecture, training, fine‑tuning, and... ...continuously improve model performance and reliability Productionization...
Principal, Training, Performance, Flexible hours
$254k - $349.25k
...protect how people, data, and AI agents connect across email... ...We are seeking a Principal ML Architect to lead the design and development... ...in model architecture, training, fine-tuning, and distillation... ...continuously improve model performance and reliability...
Principal, Training, Performance, Flexible hours
$272k - $431.25k
...NVIDIA Gruppe is seeking experienced engineers for its Dynamo platform, focusing on scalable AI systems. You will develop the Kubernetes deployment, optimize GPU resource management, and work on intelligent routing and KV-cache management. Applicants should have 15+ years...
$184k - $287.5k
...pivotal role in crafting the future of GPU technology. At NVIDIA, you will work... ..., optimizing along the axes of scalability/modularity, performance, area, yield, effort, and schedule.... ...an existing vacancy. NVIDIA uses AI tools in its recruiting processes....
Performance, Work experience placement, Night shift
- NVIDIA Gruppe in Santa Clara is seeking a Deep Learning Communication Architect to optimize DNN models and enhance communication performance during distributed training. This role requires collaboration with hardware/software teams to implement efficient communication...
Training, Performance
- ...Job Summary The AI Interconnect Architect designs and engineers high-speed... ..., power efficiency, scalability, and optimized transport protocols... ...optimize power, cost, and performance across diverse workloads.... ...hardware architecture, including GPU/accelerator clusters and...
Principal, Performance
$272k - $431.25k
...Always-On, low-overhead GPU profiling service that... ...interfaces, data flows, and scalability guarantees for multi-... .../platform layers, and performance counter/trace providers... ...with existing ML/AI workflows (e.g., PyTorch... ...on experience tuning ML training/inference loops based on...
Principal, Training, Performance
- ...experiences—from AI and data... ...THE ROLE: As a Principal AI Infrastructure... ...large‑scale LLM training and inference on... ...strong expertise in GPU‑accelerated... ...resilient, high‑performance AI workloads at... ...and SLURM. Architect and validate Kubernetes... ...applicable, scalable checkpointing)...
Principal, Training, Performance
$272k - $431.25k
...is to innovate how we architect and develop our GPU for the changing AI and accelerated workloads... ...ways to improve the scalability of our design, infrastructure... .... We are looking for a Principal System Architect with a... ...to fruition high-performance, high-volume System-on-...
Principal, Performance
$272k - $431.25k
...Responsibilities Develop innovative high-performance processor and system architectures, focusing on the memory system and energy efficiency... ...micro‑architecture features to improve the state‑of‑the‑art in GPU memory systems, optimizing along the axes of perf/W, perf/mm,...
Principal, Performance
$272k - $431.25k
...We are seeking a world-class Principal Memory Simulation Architect to design and develop the next-generation GPU memory and on-chip interconnect performance and functional models. Ideal candidates... ...large-scale architectural simulation, AI-assisted model generation, and...
Principal, Performance, Night shift
$272k - $431.25k
...and creative solutions architect with experience in network... ...to join the NVIDIA GPU Cloud Infrastructure team... ...for the product into scalable technical architectures... ...BGP, DNS, QUIC. High‑performance networking and low‑latency... ...12, 2026. NVIDIA uses AI tools in its recruiting...
Principal, Performance
- ...computing experiences-from AI and data centers, to... ...AMD's Data Center GPU organization is... ...highly accomplished Principal Modeling Architect to join the Product Architecture... ...requirements and performance projections. Your... ...project performance, scalability, and resource utilization...
Principal, Performance, Remote work
$296.3k
...Role: We are seeking a Principal AI Engineer to lead the... ...powers large-scale training and cloud inference.... .... What You’ll Do: Architect, build, and optimize... ...practices for reliability, scalability, and performance across the AI/ML... ...systems, GPU computing, and cloud...
Principal, Training, Performance, Remote work, Flexible hours
- ...generation computing experiences—from AI and data centers, to PCs, gaming and... ...re Architect THE ROLE: As GPU Software Architect, you will provide technical... ...architectural intent translates into working, performant, and scalable solutions for partnerships...
Principal, Performance, Remote work
$166.52k - $249.5k
...Principal System Architect Marvell's semiconductor solutions... ...enterprise, cloud and AI, and carrier... ...for CPU as well as GPU to unlock memory wall... ...problems for AI interface/training. The memory wall... ...hardware level for scalability and performance optimization. Benchmark...
Principal, Training, Performance, Permanent employment, Work experience placement, Internship, Work from home
$185.2k - $299.48k
...and Inclusion. We weave AI into the fabric of... ...create them. As a Sr Principal Engineer focused on the... ...scale, security, and performance at the most critical point... ...Innovation: Design, architect, and implement production... ...(PoCs) to deploying scalable ML baselines and intelligent...
Principal, Performance, Full time, Work at office, Shift work
- ...experiences-from AI and data centers,... ...We are seeking a Principal Software Quality Engineer... ...on AMD Instinct™ GPU platforms. You... ...framework, workload, performance, stress, stability... ...- LLM training and inference (PyTorch... ...regression tracking. ~ Architect the test...
Principal, Training, Performance, Contract work, Shift work
$221.7k - $364.8k
...IP) that is applied to high-performance computing devices (mobile,... ...and Responsibilities As a Principal of GPU Architect & Modeling, you will lead teams... ...graphics, compute, and AI capabilities for millions of... ...performance and power efficiency, scalable across Samsung’s premium...
Principal, Performance, Hourly pay, Full time, Worldwide, Relocation
- ...NVIDIA Gruppe in Santa Clara is seeking a technical leader for the GPU AI/HPC Infrastructure team. You will design and implement cutting... ...edge GPU compute clusters, focusing on deep learning and high-performance computing. The ideal candidate will have at least 5+ years of...
Performance
$157k - $271.4k
...empowerment, surgical performance, operating room (... ...devices, data and AI driven insights in... ...recruiting for a Principal Software Engineer... ...to scaling training and inference across... ...product and ship scalable APIs, SDKs, CLIs,... ...model CI/CD, and GPU resources management...
Principal, Training, Performance, Immediate start
- ...Overview We are now looking for a Senior GPU & Deep Learning Architect to join the NVIDIA GPU Architecture... ...architectures targeting both training and inference workloads. Advance the... ...validate, and verify functional or performance models. Develop tests, test plans, and...
Training, Performance

