
Fellow GPU Performance Optimizer for AI Training

Advanced Micro Devices

A technology company is looking for a Fellow GPU Performance Optimization Engineer in San Jose, CA. This role focuses on maximizing the performance of large-scale AI training workloads on AMD GPU platforms. Candidates must have deep expertise in GPU architecture, distributed systems, and ML workloads, alongside strong technical leadership skills. The position offers an opportunity to drive innovations across the software-hardware stack and work on impactful optimizations in an inclusive environment.

Vacancy posted 23 hours ago
Similar jobs that could be interesting for you
Based on the Fellow GPU Performance Optimizer for AI Training in San Jose, CA vacancy
  •  ...experiences-from AI and data centers,...  ...Software group. As a Fellow, you will be...  ...end-to-end software optimization strategy to achieve...  ...industry-leading performance for our top-tier customers...  ...inference and training at scale across multi-node/multi-GPU environments. ~... 
    Training
    Performance

    Advanced Micro Devices , Inc.

    San Jose, CA
    1 day ago
  •  ...generation computing experiences—from AI and data centers, to PCs, gaming and...  ...your career. THE ROLE We are seeking a Fellow GPU Performance Optimization Engineer to join our Models and...  ...performance and efficiency of large-scale AI training workloads on AMD GPU platforms. You... 
    Training
    Performance

    Advanced Micro Devices

    San Jose, CA
    23 hours ago
  •  ...computing experiences-from AI and data centers, to...  ...We are looking for a Fellow/Sr. Fellow Machine...  ...Engineer to join our Training At Scale team. If you...  ...training pipeline performance on large scale GPU cluster. Improve the...  .... Design and optimize the distributed training... 
    Training
    Performance

    Advanced Micro Devices , Inc.

    San Jose, CA
    23 hours ago
  •  ...Principal Engineer in Santa Clara, CA to lead AI infrastructure development, define GPU architecture specifications, and drive performance gains in ML systems. The role involves...  ...GPU architectures, CUDA programming, and optimizing large-scale ML systems. A Bachelor's, MS... 
    Training
    Performance

    Advanced Micro Devices

    Santa Clara, CA
    1 day ago
  • $272k - $431.25k

     ...Principal AI and ML Infra Software Engineer, GPU Clusters We are seeking a Principal AI and ML Infra...  ...such initiatives. Monitor and optimize the performance of our infrastructure ensuring...  ...improving substantial distributed training operations using PyTorch (DDP,... 
    Training
    Performance

    NVIDIA

    Santa Clara, CA
    23 hours ago
  •  ...ML Systems Engineer — Training & Inference Optimization (MBMB) We are building large-...  ...robot foundation models, high-performance training infrastructure,...  ...compute stack Optimize GPU utilization across training...  ...We are a research-driven AI and robotics company focused... 
    Training
    Performance

    Seer

    San Jose, CA
    1 day ago
  •  ...computing experiences-from AI and data centers, to PCs...  ...about improving the performance of key applications and...  ...challenges in the industry: training and running AI to make...  ...establish best practices and optimize performance from the lowest-level GPU kernels to large-scale... 
    Training
    Performance

    Advanced Micro Devices , Inc.

    Santa Clara, CA
    4 days ago
  • $184k - $287.5k

     ...unlimited potential of AI to define the next era...  ...computing. An era in which our GPU acts as the brains of...  ...CPUs, and a fully optimized NVIDIA AI and HPC software...  ...engineer to lead performance benchmarking and optimization...  ...real-world AI training, inference, and HPC workloads... 
    Training
    Performance

    NVIDIA

    Santa Clara, CA
    23 hours ago
  • $136.8k - $359.72k

     ...GPU/AI Application Platform Architect - San Jose Location:...  ...meet the requirements of high-performance, low cost and easy to operate...  ...via application performance optimizations and architecture...  ...architecture, familiar with training and inference requirements on... 
    Training
    Performance
    Temporary work
    Local area

    TikTok

    San Jose, CA
    1 day ago
  • $45 per hour

     ...You will work on improving the performance and efficiency of large-scale AI models across training, inference, and deployment. This...  ...and engineering efforts to optimize deep learning models for speed,...  ...is a plus. - Familiarity with GPU programming (CUDA, Triton, or similar... 
    Training
    Performance
    Hourly pay
    Full time
    Summer work
    Internship
    Local area

    TikTok

    San Jose, CA
    4 days ago
  •  ...computing experiences-from AI and data centers, to PCs, gaming...  ...challenge of distributed training of large models on a large number...  ...-to-end training pipeline performance. Optimize the distributed training...  ...a plus. Experience with GPU kernel optimization is a plus... 
    Training
    Performance

    Advanced Micro Devices , Inc.

    San Jose, CA
    23 hours ago
  •  ...Python | Pytorch | Distributed Training | Optimisation | GPU | Hybrid, San Jose, CA...  ...Responsibilities: Productize and optimize models from Research into reliable, performant, and cost-efficient services with...  ...: ~3–5 years in ML/AI engineering roles owning training... 
    Training
    Performance

    Enigma

    San Jose, CA
    23 hours ago
  • $184k - $287.5k

     ...Software Engineer, Model Optimization and Edge Deployment -...  ...the forefront of the AI revolution,...  ...etc. to boost E2E model performance for production deployments...  ...proven track record of training, deploying, or optimizing...  ...Strong understanding of GPU architecture, the... 
    Training
    Performance

    NVIDIA Corporation

    Santa Clara, CA
    23 hours ago
  • $173.66k - $245.16k

     ...cutting-edge technologies, optimize partner software stacks...  ...solutions that enhance performance and reliability. By...  ...databases, and analytics), AI/ML initiatives, and...  ...to enable the AI PC and GPU IP to support all of...  ...relevant education or training. Your recruiter can share... 
    Training
    Performance
    Local area
    Immediate start
    Shift work

    Intel

    Santa Clara, CA
    1 day ago
  •  ...generation computing experiences—from AI and data centers, to PCs, gaming...  ...and beyond. Principal / Senior GPU Software Performance Engineer — Post‑Training THE ROLE: Drive the performance of...  ...stability across data, model, and optimizer steps. Optimize multi‑GPU/multi‑node... 
    Training
    Performance

    Advanced Micro Devices

    San Jose, CA
    1 day ago
  • $256k - $414k

     ...design, scaling, and operations of high‑performance networking for GPU‑based cloud infrastructure. This...  ...enabling cloud gaming workloads, AI/ML training, and inference platforms by delivering...  ...at scale. Engage with ISPs to optimize low‑latency edge networks and ensure... 
    Training
    Performance
    Local area

    NVIDIA

    Santa Clara, CA
    1 day ago
  • $152k - $241.5k

     ...and Shutdown Time KPIs goal & optimizations Drive end-to-end performance excellence: debug and root-cause GPU bottlenecks and issues for gaming, creator, and AI workload, validate BSP performance...  ...across GPU SW stack, LLM training and inference, and Arm architecture... 
    Training
    Performance

    NVIDIA

    Santa Clara, CA
    4 days ago
  •  ...Summary Apple Silicon GPU SW architecture team...  ...models across many GPUs, optimizing every layer of the...  ...but also dive deep into performance profiling, implement novel...  ...help define the future of AI experiences delivered...  ...) in the context of ML training/inference ~ Must have... 
    Training
    Performance

    Apple

    Cupertino, CA
    1 day ago
  •  ...is developing a new class of GPU and AI silicon for large-scale model inference and training. The compiler stack connects industry...  ..., and is central to the performance and efficiency the company delivers...  ..., and target-specific optimizations. Implement code generation improvements... 
    Training
    Performance
    Internship

    Oxmiq Labs

    Campbell, CA
    3 days ago
  • $122.44k - $232.19k

     ...Role and Impact: As a GPU Logic Design Engineer at...  ...directly to achieving Intel's performance, power, area, and...  ...tools, and methods to optimize logic design for power,...  ...web services, HPC, and AI‑accelerated systems. Our...  ...relevant education or training. Your recruiter can share... 
    Training
    Performance
    Local area
    Immediate start
    Worldwide
    Flexible hours
    Shift work

    Intel Corporation

    Santa Clara, CA
    1 day ago
  • A leading technology company is seeking a Fellow in AI Software to drive the software optimization strategy for top-tier customers. This role involves defining technical vision, leading workload performance engineering, and engaging with customers to deliver tailored solutions... 
    Performance

    Advanced Micro Devices

    San Jose, CA
    3 days ago
  • $207k - $300k

     ...Engineer, GDC LLM Serving and GPU Performance Google Sunnyvale, CA, USA...  ...Language Models? Join the GDC AI Models and Performance team...  ...and flexibility. You could be optimizing KV cache transfer mechanisms...  ..., and relevant education or training. Your recruiter can share... 
    Training
    Performance
    Full time

    Google Inc.

    Sunnyvale, CA
    4 days ago
  • $224k - $356.5k

     ...to lead a team of skilled performance engineers collaborating...  ...platform is known for its AI dominance in deep learning training and inference. Nonetheless...  ...innovative techniques to optimize performance of complex...  ...optimization, including GPU parallel programming, e.g... 
    Training
    Performance

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $131k - $226k

     ...- Velox Operators for GPU Location San Jose, California...  ...computation to GPUs. Optimize memory bandwidth usage...  .... Debug complex performance bottlenecks in a distributed...  ...required by applicable law Training and educational...  ...resources on our personalized, AI‑driven learning... 
    Training
    Performance
    Full time
    Temporary work

    IBM

    San Jose, CA
    2 days ago
  • $207k - $300k

     ...Experience with modern GPU architectures (NVIDIA, AMD, or other AI accelerators), memory hierarchies, and performance bottlenecks. Experience...  ...Experience with compiler optimization, code generation, and runtime...  ...relevant education or training. Your recruiter can share... 
    Training
    Performance
    Full time
    Temporary work
    Worldwide

    Google

    Sunnyvale, CA
    2 days ago
  • $156k - $229k

    Senior Design Technology Co-Optimization Engineer Google • Sunnyvale,...  ...class IP blocks (e.g., high-performance CPU/GPU cores, SRAM arrays, or high-...  ...work to shape the future of AI/ML hardware acceleration. You...  ..., and relevant education or training. Your recruiter can share... 
    Training
    Performance
    Full time
    Worldwide

    Google Inc.

    Sunnyvale, CA
    23 hours ago
  •  ...computing experiences-from AI and data centers, to PCs, gaming...  ...leader for the role of AMD Fellow, OneROCm - driving a unified...  ..., models, frameworks, and performance optimization layers. The role also requires...  ...: ~ Knowledge in GPU architectures, basic knowledge... 
    Performance

    Advanced Micro Devices , Inc.

    San Jose, CA
    23 hours ago
  • A leading semiconductor company is looking for a Principal/Senior GPU Software Performance Engineer in San Jose, CA. The role involves optimizing post-training workloads on AMD Instinct GPUs, improving throughput, and collaborating with various teams to drive measurable... 
    Training
    Performance

    Advanced Micro Devices

    San Jose, CA
    1 day ago
  • $109k - $160k

     ...GPU Infrastructure Software Engineer Sunnyvale, CA CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers...  ...superior infrastructure performance with deep technical...  ...AI/ML infrastructure and training / inference. The base... 
    Training
    Performance
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Remote work
    Flexible hours

    CoreWeave

    Sunnyvale, CA
    2 days ago
  • $184k - $287.5k

     ...our team at NVIDIA and help bring AI solutions to our largest...  ...offering support in understanding performance aspects related to tasks like large scale LLM training and inference. Conducting regular...  ...diagnostics. Hands-on experience with GPU systems in general including but... 
    Training
    Performance

    NVIDIA

    Santa Clara, CA
    23 hours ago
