Principal AI Performance Architect for Scalable GPU Training
Advanced Micro Devices
Advanced Micro Devices is looking for a Principal Engineer in Santa Clara, CA to lead AI infrastructure development, define GPU architecture specifications, and drive performance gains in ML systems. The role involves leading innovative techniques, collaborating with stakeholders, and establishing best practices for distributed ML systems. The ideal candidate has extensive experience in GPU architectures, CUDA programming, and optimizing large-scale ML systems. A Bachelor's, MS or PhD in Computer Science or Engineering is required.
$272k - $431.25k
...artificial intelligence (AI), agentic workloads, deep learning (DL), high-performance computing (HPC), cloud... .... Knowledge of GPU and SOC design. Experience... ...faster AI model training, agentic use-cases, efficient... ...data processing, and scalable cloud deployments....PrincipalTrainingPerformance$272k - $431.25k
...Principal Ai And Ml Infra Software Engineer, Gpu Clusters We are seeking a Principal AI and... ...potent, effective, and scalable solutions as we mold the... ...Monitor and optimize the performance of our infrastructure... ...substantial distributed training operations using...PrincipalTrainingPerformance
- ...experiences-from AI and data centers,... ...seeking a Robotics AI Architect to define and... ...production-grade performance targets. THE PERSON... ...broad ecosystem scalability. KEY RESPONSIBILITIES... ...across CPU, GPU, and accelerators... ...subsystems Cloud (training, simulation, fleet...PrincipalTrainingPerformance
- ...experiences—from AI and data centers,... ...career. THE ROLE As a Principal Engineer, you will... ...by defining GPU architecture specifications... ...massive model training at scale. Your... ...expertise will drive 2-3x performance gains in both... ...dimensions Architect memory‑efficient training...PrincipalTrainingPerformanceRemote work
$254k - $349.25k
...protect how people, data, and AI agents connect across email... ...We are seeking a Principal ML Architect to lead the design and development... ...in model architecture, training, fine-tuning, and distillation... ...continuously improve model performance and reliability...PrincipalTrainingPerformanceFlexible hours$254k - $349.25k
...protect how people, data, and AI agents connect across email... ...Overview We are seeking a Principal ML Architect to lead the design and... ...expertise in model architecture, training, fine‑tuning, and... ...continuously improve model performance and reliability Productionization...PrincipalTrainingPerformanceFlexible hours$184k - $287.5k
...pivotal role in crafting the future of GPU technology. At NVIDIA, you will work... ..., optimizing along the axes of scalability/modularity, performance, area, yield, effort, and schedule.... ...an existing vacancy. NVIDIA uses AI tools in its recruiting processes....PerformanceWork experience placementNight shift
- ...experiences-from AI and data... ...ROLE: As a Principal AI Infrastructure... ...large-scale LLM training and inference on... ...strong expertise in GPU-accelerated... ..., high-performance AI workloads at... ...and SLURM. Architect and validate Kubernetes... ...where applicable, scalable checkpointing)...PrincipalTrainingPerformance
$272k - $431.25k
...Always-On, low-overhead GPU profiling service that... ...interfaces, data flows, and scalability guarantees for multi-... .../platform layers, and performance counter/trace providers... ...with existing ML/AI workflows (e.g., PyTorch... ...on experience tuning ML training/inference loops based on...PrincipalTrainingPerformance$272k - $431.25k
...is to innovate how we architect and develop our GPU for the changing AI and accelerated workloads... ...ways to improve the scalability of our design, infrastructure... .... We are looking for a Principal System Architect with a... ...to fruition high-performance, high-volume System-on-...PrincipalPerformance$272k - $431.25k
NVIDIA Corporation in Santa Clara seeks a Principal GPU Memory Simulation Architect to design advanced GPU memory systems. Candidates with a Bachelor... ...encouraged to apply. The role includes developing performance models with AI, defining innovative GPU features, and...PrincipalPerformance$166.52k - $249.5k
...Principal System Architect Marvell's semiconductor solutions... ...enterprise, cloud and AI, and carrier... ...for CPU as well as GPU to unlock memory wall... ...problems for AI interface/training. The memory wall... ...hardware level for scalability and performance optimization. Benchmark...PrincipalTrainingPerformancePermanent employmentWork experience placementInternshipWork from home$272k - $431.25k
...creative solutions architect with experience in... ...join the NVIDIA GPU Cloud Infrastructure... ...We are seeking a Principal Solutions Architect... ...and optimize high-performance network... ...the product into scalable technical architectures... ...NVIDIA uses AI tools in its recruiting...PrincipalPerformanceRemote work
- ...experiences-from AI and data centers,... ...We are seeking a Principal Software Quality Engineer... ...on AMD Instinct™ GPU platforms. You... ...framework, workload, performance, stress, stability... ...- LLM training and inference (PyTorch... ...regression tracking. ~ Architect the test...PrincipalTrainingPerformanceContract workShift work
$157k - $271.4k
...empowerment, surgical performance, operating room (... ...devices, data and AI‑driven insights in... ...recruiting for a Principal Software Engineer... ...product and ship scalable APIs, SDKs, CLIs and... ...Orchestrator and Training/Inference control... ...spend, especially for GPU‑intensive training...PrincipalTrainingPerformance$164.8k - $226.6k
...electronics, ensuring performance, resilience and scalability. For decades,... ...high-growth ones in AI datacenters, automated... ...Summary The Principal Hardware Signal Integrity... .../Power Integrity Architect is accountable for... ..., education, and training. In addition to...PrincipalTrainingPerformance$296.3k
...We are seeking a Principal AI Engineer to lead the... ...powers large-scale training and cloud inference.... ...What You'll Do: Architect, build, and optimize... ...practices for reliability, scalability, and performance across the AI/ML... ...distributed systems, GPU computing, and cloud...PrincipalTrainingPerformanceLocal areaRemote workWork from homeFlexible hours
- ...generation computing experiences—from AI and data centers, to PCs, gaming and... ...re Architect THE ROLE: As GPU Software Architect, you will provide technical... ...architectural intent translates into working, performant, and scalable solutions for partnerships...PrincipalPerformanceRemote work
- ...I/O Interconnect Architect What You Do At... ...experiences—from AI and data centers,... ...Intelligence (AI), High Performance Computing (HPC),... ...shape the future of scalable, high-performance... ..., including CPU/GPU interconnects, memory... ...enumeration, link training, error handling)....TrainingPerformance
$184k - $287.5k
...are now looking for a Senior GPU & Deep Learning Architect! The NVIDIA GPU... ...fields delivering the highest performance in the world for deep learning... ...deep learning workloads, both training and inference, and maintain... ...real-time, cost-effective AI computing platform driving...TrainingPerformance
- ...recognized globally for innovation, performance and quality. Sandisk has... ...Job Description An AI Interconnect Architect defines and engineers high-... ..., power efficiency, scalability, and optimized transport protocols... ...: Familiarity with GPU/accelerator clusters and data...PrincipalPerformanceTemporary workRemote workFlexible hoursShift work
$128k - $312k
...Expect The Tesla AI Hardware team is... ...built to efficiently train massive neural... ...computational efficiency and performance. By creating... ...the AI/ML Compute Architect will drive the... ...create efficient, scalable solutions that power... ...knowledge of CPU, GPU, and ML...TrainingPerformanceHourly payFull timeTemporary workWork at officeFlexible hours$240k - $379.5k
...customers where they are on their AI journey on our GPUs - this... ...Manager for AI Platform post-training and RL, you will be responsible... ...builders to get the best large scale performance, resilience and experience on... ...deep learning across all GPU use cases and providing great...PrincipalTrainingPerformance$272k - $431.25k
NVIDIA Corporation seeks a Principal AI and ML Infra Software Engineer in Santa Clara, California... ...the efficiency of AI/ML research on GPU Clusters. The role involves collaboration... ...teams, monitoring infrastructure performance, and implementing improvements based on...PrincipalPerformance$210k - $265k
...solutions that accelerate AI data centers to... ...AI. Today's AI performance is frequently limited... ...and unlock greater GPU utilization to speed training job completion... ..., performance, and scalability testing. Work closely... ...Proven ability to architect test frameworks, design...PrincipalTrainingPerformance
- ...generation computing experiences-from AI and data centers, to PCs,... ..., we advance your career. Principal / Senior GPU Software Performance Engineer - Post-Training THE ROLE: Drive the performance... ...debugging. Develop scalable tooling and automation to improve...PrincipalTrainingPerformance
- NVIDIA Corporation is seeking a Power Architect for New College Grad 2026 in Santa Clara,... ...You will be responsible for architecting GPU power features and managing system-level... ...Computer Engineering, with knowledge of performance simulators and programming tools. Salary...Performance
$272k - $431.25k
...demanding high-speed IO applications a GPU or high-performance computing server will encounter in its... ...by collaborating closely with silicon architects, board/rack level designers, and... ...interpersonal skills ~ Capability to use AI prompt tools Your base salary will...PrincipalPerformance
- ...feature engineering, model training, deployment,... ...explainability, and responsible AI compliance. **... ...*Proven experience** architecting large-scale ML/AI... ...platforms with attention to performance, scalability, and maintainability.... .... The ideal Principal has a deep technical...PrincipalTrainingPerformance
$272k - $431.25k
...Learning (ML) Engineer to join the GPU accelerated Apache Spark team.... ...for ETL, SQL, and ML/DL model training and inference pipelines,... ...You will apply the latest ML/AI methods to empower enterprises... ...machine learning solutions for performance prediction and optimization of...PrincipalTrainingPerformance