GenAI Inference Optimization Lead — GPU Performance

Advanced Micro Devices

A leading technology company is looking for a Principal GenAI Inference Optimization Engineer in San Jose, CA. This role will focus on optimizing performance and efficiency of generative AI on AMD GPU platforms. The ideal candidate will have significant expertise in GPU architecture, GenAI optimization techniques, and performance tuning tools. You will work across various layers and collaborate with cross-functional teams to drive impactful optimizations. This position is hybrid and offers a dynamic work environment.

Vacancy posted 20 hours ago

Similar jobs that could be interesting for you
Based on the GenAI Inference Optimization Lead — GPU Performance in San Jose, CA vacancy

GPU Compiler Lead
$175k - $250k
...We're looking for a highly experienced GPU Compiler Lead. You will lead the design and implementation of a high-performance compiler stack for our proprietary GPU architecture... ...high-level languages into highly optimized machine code. What you'll do: Lead...
Performance
Work from home
Bolt Graphics
Sunnyvale, CA
4 days ago
Senior GPU Compiler Architect — LLVM & Codegen Lead
$175k - $250k
...in Sunnyvale is looking for a highly experienced GPU Compiler Lead to design and implement a high-performance compiler stack for their proprietary GPU architecture... ...include leading backend development, optimizing workloads, and collaborating with hardware architects...
Performance
Bolt Graphics
Sunnyvale, CA
20 hours ago
Lead AI Inference Performance Engineer (GPU)
A leading technology company is looking for a Principal AI Performance Engineer to optimize AI inference performance on GPUs. In this role, you will lead a team driving performance optimization... ...possess extensive experience in GPU computing, strong analytical skills, and...
Performance
Advanced Micro Devices
San Jose, CA
20 hours ago
CPU Compiler Lead - RISC-V GPU Toolchain Innovator
$175k - $250k
A semiconductor startup in Sunnyvale is seeking a CPU Compiler Lead to enhance the performance of their cutting-edge GPU architecture. The role involves implementing the latest ISAs in compilers, leading vectorization and autotuning efforts, and ensuring robust integration...
Performance
Bolt Graphics
Sunnyvale, CA
20 hours ago
Lead ML Inference Architect, Advertising Platform
...technical leader to architect and develop a state-of-the-art inference platform for advertising systems. The ideal candidate will have... ...role, with expertise in machine learning deployment and high-performance computing. This position also offers a competitive salary range...
Performance
Roku, Inc.
San Jose, CA
4 days ago
Networking Lead - NVIS New Product Introduction
$224k - $356.5k
...next era of computing. An era in which our GPU acts as the brains of computers, robots,... ..., NVLINK, and NVSwitch RDMA fabrics—for performance and scalability, ensuring they meet real... ..., network validation, and workflow optimization. ~ In-depth experience with network management...
Performance
NVIDIA
Santa Clara, CA
20 hours ago
Principal Engineer, Solutions Architect Lead - Industrial & Embedded IoT, Edge AI On‑Prem Appliance
$220.2k - $330.4k
...Embedded IoT (IE‑IoT) BU leads the transformation of... ...for generative AI inference and computer vision workloads... ...the accessibility and performance of a datacenter... ...developing innovative genAI and hybridAI solutions... ...instrument, profile, and optimize models and pipelines...
Performance
Work experience placement
Work at office
Qualcomm
Santa Clara, CA
3 days ago
Principal AI Lead - Surgical AI
$157k - $271.4k
...advances patient empowerment, surgical performance, operating room (OR) collaboration and... ...We are recruiting for a Principal AI Lead within the Polyphonic® Applied AI and... ...hallucination Model Integration & Inference Optimization Integrate sophisticated models with...
Performance
Local area
Immediate start
J&J Family of Companies
Santa Clara, CA
4 hours ago
Fellow, AI Software: Workload Optimization Leader
A leading technology company is seeking a Fellow in AI Software to drive the software optimization strategy for top-tier customers. This role involves defining technical vision, leading workload performance engineering, and engaging with customers to deliver tailored solutions...
Performance
Advanced Micro Devices
San Jose, CA
3 days ago
Senior Math Libraries Engineer, CPU and GPU Optimization
$224k - $356.5k
...Python, Numpy, JAX, MLIR…), and optimization at runtime for maximum flexibility and performance. Our libraries follow CUDA Everywhere... ...kernels for math libraries and lead design reviews with all... ...cross-compilation, setting up CPU/GPU/accelerator (cross-)compilation...
Performance
Flexible hours
Shift work
NVIDIA Corporation
Santa Clara, CA
1 day ago
Founder Vice President, AI Inference Software
...Founder Vice President, AI Inference Software About the... ...the future of high-performance AI inference. The successful... ...include leading the architecture and delivery... ...background in building or optimizing production-scale AI... ...skills in areas such as GPU programming, kernel...
Performance
Confidential
San Jose, CA
4 days ago
Sr. Multimodal Model Training and Inference Optimization Engineer
$244.8k
...experienced Multimodal Model Training and Inference Optimization Engineer with expertise in optimizing... ...edge of AI efficiency, enhancing the performance, scalability, and deployment of large-... ...do great things with great people. We lead with curiosity, humility, and a desire...
Performance
Temporary work
Local area
ByteDance
San Jose, CA
1 day ago
Principal / Director - Robotics System Architecture Lead
$242.3k - $363.5k
...Robotics System Architecture Lead to define and own the end-to-... ...principles for real-time behavior, performance, safety isolation,... ...across heterogeneous compute (CPU/GPU/DSP/NPU/MCU). Lead architectural... ..., MCU-based systems, and AI inference runtimes. ~ Familiarity...
Performance
Work experience placement
Work from home
Qualcomm
Santa Clara, CA
3 days ago
Lead Systems Software Architect
$350k
...looking for a highly skilled Lead Systems Software Architect who... ...design, implement, debug, and optimize the software platform that spans... ...full system, ensuring it is performant, secure, and scalable across SKUs... ...and enforce memory, CPU/GPU/NPU, and storage budgets across...
Performance
Work at office
Local area
Remote work
Monday to Thursday
Flexible hours
Roku
San Jose, CA
2 days ago
Senior DL Inference & Performance Engineer
$184k - $356.5k
A leading technology company in California is seeking a Senior DL... ...Engineer to drive inference performance for Deep Learning workloads.... ...collaborating with co-design teams to optimize performance across hardware... ...experience in deep learning and GPU programming. This position...
Performance
NVIDIA Corporation
Santa Clara, CA
3 days ago
Post-Training LLM Inference Platform Engineer
...engineer in San Jose, CA, focusing on ML infrastructure and performance optimization for large-scale model inference. Ideal candidates should have a strong background in systems engineering and experience with GPU workloads. This hybrid position offers collaboration...
Performance
Advanced Micro Devices
San Jose, CA
3 days ago
Senior AI Inference Infrastructure Engineer
$163.5k - $212.4k
NIO is seeking a Senior AI Inference Infrastructure Software Engineer in San Jose, CA, specializing in building scalable inference... ...of software development experience and strong skills in performance optimization and GPU programming. The position offers a competitive salary...
Performance
nio.com
San Jose, CA
1 day ago
Software Engineer: ML Optimization
...Systems Engineer — Training & Inference Optimization (MBMB) We are building... ...robot foundation models, high-performance training infrastructure, and... ...full compute stack Optimize GPU utilization across training... ...and implement changes that lead to measurable gains in...
Performance
Seer
Sunnyvale, CA
1 day ago
Enterprise GTM Leader
$191k - $281k
...confidence. Trusted by leading AI labs, startups,... ...superior infrastructure performance with deep technical... ...need for specialized GPU infrastructure and scalable... .... * Track and optimize key GTM metrics including... ...platforms for AI training, inference, or HPC workloads....
Performance
Permanent employment
Full time
Contract work
Temporary work
Casual work
Work at office
Remote work
Flexible hours
CoreWeave
Sunnyvale, CA
13 hours ago
AI Inference Engineer - Speech
$151.8k
...AI Inference Engineer We are looking for an AI Inference Engineer... ...inference hardware, such as GPU, TPU and AI-specific chips. Our... ...are not available. Optimizing ASR inference systems for production... ...Optimizing model inference performance by diving deep into the lower...
Performance
Zoom Video Communications
San Jose, CA
3 days ago
ML Engineers - with LLM GenAI
...Title: ML Engineers - with LLM GenAI (3 Resources)... ...Vectorize and index data Inference pipeline - AI Guided workflow... ...Model metrics to evaluate the performance of the model (Predictive & Gen... ...tasks to enable picking the optimal models Apply Guardrails...
Performance
Work experience placement
Sparktek
San Jose, CA
2 days ago
GPU Software Architecture Engineer, Graphics, Games, & ML
...Summary Apple Silicon GPU SW architecture team... ...principal engineer to lead server-side ML... ...models across many GPUs, optimizing every layer of the stack... ...deep understanding of inference workload characteristics... ...but also dive deep into performance profiling, implement...
Performance
Apple
Cupertino, CA
1 day ago
Compiler Engineer - AI Inference
$152k - $241.5k
...NVIDIA's invention of the GPU in 1999 sparked the... ...what is possible in AI performance and help build the technology... ...computational graph optimizations for next-generation... ...AI workloads (both inference and training) and successfully... ...-design: Partner with leading experts across our...
Performance
NVIDIA
Santa Clara, CA
3 days ago
Imaging Management Lead
...healthcare organization to identify an Imaging Management Lead who will play a key role in supporting daily operations while fostering a collaborative and high-performing clinical team. If you enjoy mentoring others, optimizing workflows, and contributing to a patient-centered...
Performance
Full time
Weekday work
Cornerstone Staffing Solutions
San Jose, CA
9 days ago
Senior Software Engineer, Machine Learning Inference
$152k - $241.5k
...in developing the industry-leading deep learning inference software for NVIDIA AI... ...implementing inference software optimizations to power AI applications... ...deep learning experts and GPU architects throughout the... ...Knowledge of close-to-metal performance analysis, optimization...
Performance
NVIDIA
Santa Clara, CA
2 days ago
Senior Deep Learning Architect, LLM Inference
$184k - $287.5k
...Deep Learning Architect, LLM Inference! NVIDIA is at the... ...focuses on inference server performance optimization for Large Language Models (... ...pushing the boundaries of GPU hardware and software performance... ...launches produce industry leading performance. Collaborate...
Performance
NVIDIA
Santa Clara, CA
4 days ago
Machine Learning Engineer, Proactive - Large Language Models & Generative AI Inference
$147.4k - $272.1k
...Models & Generative AI Inference The Intelligence... ...Generative AI and high-performance systems computing. Your... ...Your tasks will include: Leading the exploration and... ...of our clients on the GenAI inference stack, ensuring performance optimization and alignment with broader...
Performance
Relocation
Apple
Cupertino, CA
4 days ago
Apple Silicon GPU Driver Engineer, Graphics, Game and ML
...Apple Silicon GPU Driver Engineer, Graphics, Game... ...principal engineer to lead design of GPU... ...orchestrate distributed inference across multi-node clusters... ...directly impacts the performance and power efficiency of... ...and scheduler features optimized for ML/LLM workloads...
Performance
Apple
Cupertino, CA
1 day ago
Machine Learning Engineer | Python | Pytorch | Distributed Training | Optimisation | GPU | Hybrid, San Jose, CA
...Distributed Training | Optimisation | GPU | Hybrid, San Jose, CA Title:... ...: Productize and optimize models from Research into reliable, performant, and cost-efficient services with... ...Flash Attention) for training and inference without materially degrading quality...
Performance
Enigma
San Jose, CA
20 hours ago
Supply Chain Lead
$195k - $230k
...S.C. 1324b(a)(3). Role: Supply Chain Lead Location: San Jose, CA Compensation... ...~ Own end-to-end hardware supply chain performance, from sourcing through final delivery... ...supplier and manufacturing partners to optimize for cost, quality, and reliability...
Performance
Permanent employment
Rivet Industries
San Jose, CA
3 days ago
