GenAI Inference Optimization Lead — GPU Performance

Advanced Micro Devices

A leading technology company is looking for a Principal GenAI Inference Optimization Engineer in San Jose, CA. This role focuses on optimizing the performance and efficiency of generative AI workloads on AMD GPU platforms. The ideal candidate will have significant expertise in GPU architecture, GenAI optimization techniques, and performance tuning tools. You will work across multiple layers of the stack and collaborate with cross-functional teams to drive impactful optimizations. This is a hybrid position.

Vacancy posted 20 hours ago
Similar jobs that could be interesting for you
Based on the GenAI Inference Optimization Lead — GPU Performance vacancy in San Jose, CA
  • $175k - $250k

     ...We're looking for highly experienced GPU Compiler Lead. You will lead the design and implementation of a high-performance compiler stack for our proprietary GPU architecture...  ...high-level languages into highly optimized machine code. What you'll do: Lead... 
    Performance
    Work from home

    Bolt Graphics

    Sunnyvale, CA
    4 days ago
  • $175k - $250k

     ...in Sunnyvale is looking for a highly experienced GPU Compiler Lead to design and implement a high-performance compiler stack for their proprietary GPU architecture...  ...include leading backend development, optimizing workloads, and collaborating with hardware architects... 
    Performance

    Bolt Graphics

    Sunnyvale, CA
    20 hours ago
  • A leading technology company is looking for a Principal AI Performance Engineer to optimize AI inference performance on GPUs. In this role, you will lead a team driving performance optimization...  ...possess extensive experience in GPU computing, strong analytical skills, and... 
    Performance

    Advanced Micro Devices

    San Jose, CA
    20 hours ago
  • $175k - $250k

    A semiconductor startup in Sunnyvale is seeking a CPU Compiler Lead to enhance the performance of their cutting-edge GPU architecture. The role involves implementing latest ISAs in compilers, leading vectorization and autotuning efforts, and ensuring robust integration... 
    Performance

    Bolt Graphics

    Sunnyvale, CA
    20 hours ago
  •  ...technical leader to architect and develop a state-of-the-art inference platform for advertising systems. The ideal candidate will have...  ...role, with expertise in machine learning deployment and high-performance computing. This position also offers a competitive salary range... 
    Performance

    Roku, Inc.

    San Jose, CA
    4 days ago
  • $224k - $356.5k

     ...next era of computing. An era in which our GPU acts as the brains of computers, robots,...  ..., NVLINK, and NVSwitch RDMA fabrics—for performance and scalability, ensuring they meet real...  ..., network validation, and workflow optimization. ~ In-depth experience with network management... 
    Performance

    NVIDIA

    Santa Clara, CA
    20 hours ago
  • $220.2k - $330.4k

     ...Embedded IoT (IE‑IoT) BU leads the transformation of...  ...for generative AI inference and computer vision workloads...  ...the accessibility and performance of a datacenter...  ...developing innovative genAI and hybridAI solutions...  ...instrument, profile, and optimize models and pipelines... 
    Performance
    Work experience placement
    Work at office

    Qualcomm

    Santa Clara, CA
    3 days ago
  • $157k - $271.4k

     ...advances patient empowerment, surgical performance, operating room (OR) collaboration and...  ...We are recruiting for a Principal AI Lead within the Polyphonic® Applied AI and...  ...hallucination Model Integration & Inference Optimization Integrate sophisticated models with... 
    Performance
    Local area
    Immediate start

    J&J Family of Companies

    Santa Clara, CA
    4 hours ago
  • A leading technology company is seeking a Fellow in AI Software to drive the software optimization strategy for top-tier customers. This role involves defining technical vision, leading workload performance engineering, and engaging with customers to deliver tailored solutions... 
    Performance

    Advanced Micro Devices

    San Jose, CA
    3 days ago
  • $224k - $356.5k

     ...Python, Numpy, JAX, MLIR…), and optimization at runtime for maximum flexibility and performance. Our libraries follow CUDA Everywhere...  ...kernels for math libraries and lead design reviews with all...  ...cross-compilation, setting up CPU/GPU/accelerator (cross-)compilation... 
    Performance
    Flexible hours
    Shift work

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  •  ...Founder Vice President, AI Inference Software About the...  ...the future of high-performance AI inference. The successful...  ...include leading the architecture and delivery...  ...background in building or optimizing production-scale AI...  ...skills in areas such as GPU programming, kernel... 
    Performance

    Confidential

    San Jose, CA
    4 days ago
  • $244.8k

     ...experienced Multimodal Model Training and Inference Optimization Engineer with expertise in optimizing...  ...edge of AI efficiency, enhancing the performance, scalability, and deployment of large-...  ...do great things with great people. We lead with curiosity, humility, and a desire... 
    Performance
    Temporary work
    Local area

    ByteDance

    San Jose, CA
    1 day ago
  • $242.3k - $363.5k

     ...Robotics System Architecture Lead to define and own the end-to-...  ...principles for real-time behavior, performance, safety isolation,...  ...across heterogeneous compute (CPU/GPU/DSP/NPU/MCU). Lead architectural...  ..., MCU-based systems, and AI inference runtimes. ~ Familiarity... 
    Performance
    Work experience placement
    Work from home

    Qualcomm

    Santa Clara, CA
    3 days ago
  • $350k

     ...looking for a highly skilled Lead Systems Software Architect who...  ...design, implement, debug, and optimize the software platform that spans...  ...full system, ensuring it is performant, secure, and scalable across SKUs...  ...and enforce memory, CPU/GPU/NPU, and storage budgets across... 
    Performance
    Work at office
    Local area
    Remote work
    Monday to Thursday
    Flexible hours

    Roku

    San Jose, CA
    2 days ago
  • $184k - $356.5k

    A leading technology company in California is seeking a Senior DL...  ...Engineer to drive inference performance for Deep Learning workloads....  ...collaborating with co-design teams to optimize performance across hardware...  ...experience in deep learning and GPU programming. This position... 
    Performance

    NVIDIA Corporation

    Santa Clara, CA
    3 days ago
  •  ...engineer in San Jose, CA, focusing on ML infrastructure and performance optimization for large-scale model inference. Ideal candidates should have a strong background in systems engineering and experience with GPU workloads. This hybrid position offers collaboration... 
    Performance

    Advanced Micro Devices

    San Jose, CA
    3 days ago
  • $163.5k - $212.4k

    NIO is seeking a Senior AI Inference Infrastructure Software Engineer in San Jose, CA, specializing in building scalable inference...  ...of software development experience and strong skills in performance optimization and GPU programming. The position offers a competitive salary... 
    Performance

    nio.com

    San Jose, CA
    1 day ago
  •  ...Systems Engineer — Training & Inference Optimization (MBMB) We are building...  ...robot foundation models, high-performance training infrastructure, and...  ...full compute stack Optimize GPU utilization across training...  ...and implement changes that lead to measurable gains in... 
    Performance

    Seer

    Sunnyvale, CA
    1 day ago
  • $191k - $281k

     ...confidence. Trusted by leading AI labs, startups,...  ...superior infrastructure performance with deep technical...  ...need for specialized GPU infrastructure and scalable...  .... * Track and optimize key GTM metrics including...  ...platforms for AI training, inference, or HPC workloads.... 
    Performance
    Permanent employment
    Full time
    Contract work
    Temporary work
    Casual work
    Work at office
    Remote work
    Flexible hours

    CoreWeave

    Sunnyvale, CA
    13 hours ago
  • $151.8k

     ...AI Inference Engineer We are looking for an AI Inference Engineer...  ...inference hardware, such as GPU, TPU and AI-specific chips. Our...  ...are not available. Optimizing ASR inference systems for production...  ...Optimizing model inference performance by diving deep into the lower... 
    Performance

    Zoom Video Communications

    San Jose, CA
    3 days ago
  •  ...Title: ML Engineers - with LLM GenAI (3 Resources)...  ...Vectorize and index data Inference pipeline - AI Guided workflow...  ...Model metrics to evaluate the performance of the model (Predictive & Gen...  ...tasks to enable picking the optimal models Apply Guardrails... 
    Performance
    Work experience placement

    Sparktek

    San Jose, CA
    2 days ago
  •  ...Summary Apple Silicon GPU SW architecture team...  ...principal engineer to lead server-side ML...  ...models across many GPUs, optimizing every layer of the stack...  ...deep understanding of inference workload characteristics...  ...but also dive deep into performance profiling, implement... 
    Performance

    Apple

    Cupertino, CA
    1 day ago
  • $152k - $241.5k

     ...NVIDIA's invention of the GPU in 1999 sparked the...  ...what is possible in AI performance and help build the technology...  ...computational graph optimizations for next-generation...  ...AI workloads (both inference and training) and successfully...  ...-design: Partner with leading experts across our... 
    Performance

    NVIDIA

    Santa Clara, CA
    3 days ago
  •  ...healthcare organization to identify an Imaging Management Lead who will play a key role in supporting daily operations while fostering a collaborative and high-performing clinical team. If you enjoy mentoring others, optimizing workflows, and contributing to a patient-centered... 
    Performance
    Full time
    Weekday work

    Cornerstone Staffing Solutions

    San Jose, CA
    9 days ago
  • $152k - $241.5k

     ...in developing the industry-leading deep learning inference software for NVIDIA AI...  ...implementing inference software optimizations to power AI applications...  ...deep learning experts and GPU architects throughout the...  ...Knowledge of close-to-metal performance analysis, optimization... 
    Performance

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $184k - $287.5k

     ...Deep Learning Architect, LLM Inference! NVIDIA is at the...  ...focuses on inference server performance optimization for Large Language Models (...  ...pushing the boundaries of GPU hardware and software performance...  ...launches produce industry leading performance. Collaborate... 
    Performance

    NVIDIA

    Santa Clara, CA
    4 days ago
  • $147.4k - $272.1k

     ...Models & Generative AI Inference The Intelligence...  ...Generative AI and high-performance systems computing. Your...  ...Your tasks will include: Leading the exploration and...  ...of our clients on the GenAI inference stack, ensuring performance optimization and alignment with broader... 
    Performance
    Relocation

    Apple

    Cupertino, CA
    4 days ago
  •  ...Apple Silicon GPU Driver Engineer, Graphics, Game...  ...Principal Engineer to Lead Design of GPU...  ...Orchestrate Distributed Inference Across Multi-Node Clusters...  ...Directly Impacts the Performance and Power Efficiency of...  ...and Scheduler Features Optimized for ML/LLM Workloads... 
    Performance

    Apple

    Cupertino, CA
    1 day ago
  •  ...Distributed Training | Optimisation | GPU | Hybrid, San Jose, CA Title:...  ...: Productize and optimize models from Research into reliable, performant, and cost-efficient services with...  ...Flash Attention) for training and inference without materially degrading quality... 
    Performance

    Enigma

    San Jose, CA
    20 hours ago
  • $195k - $230k

     ...S.C. 1324b(a)(3). Role: Supply Chain Lead Location: San Jose, CA Compensation...  ...~ Own end-to-end hardware supply chain performance, from sourcing through final delivery...  ...supplier and manufacturing partners to optimize for cost, quality, and reliability... 
    Performance
    Permanent employment

    Rivet Industries

    San Jose, CA
    3 days ago
