GenAI Inference Optimization Lead — GPU Performance
Advanced Micro Devices
A leading technology company is looking for a Principal GenAI Inference Optimization Engineer in San Jose, CA. This role will focus on optimizing performance and efficiency of generative AI on AMD GPU platforms. The ideal candidate will have significant expertise in GPU architecture, GenAI optimization techniques, and performance tuning tools. You will work across various layers and collaborate with cross-functional teams to drive impactful optimizations. This position is hybrid and offers a dynamic work environment.
$175k - $250k
...We're looking for a highly experienced GPU Compiler Lead. You will lead the design and implementation of a high-performance compiler stack for our proprietary GPU architecture... ...high-level languages into highly optimized machine code. What you'll do: Lead...PerformanceWork from home$175k - $250k
...in Sunnyvale is looking for a highly experienced GPU Compiler Lead to design and implement a high-performance compiler stack for their proprietary GPU architecture... ...include leading backend development, optimizing workloads, and collaborating with hardware architects...Performance- A leading technology company is looking for a Principal AI Performance Engineer to optimize AI inference performance on GPUs. In this role, you will lead a team driving performance optimization... ...possess extensive experience in GPU computing, strong analytical skills, and...Performance
$175k - $250k
A semiconductor startup in Sunnyvale is seeking a CPU Compiler Lead to enhance the performance of their cutting-edge GPU architecture. The role involves implementing latest ISAs in compilers, leading vectorization and autotuning efforts, and ensuring robust integration...Performance- ...technical leader to architect and develop a state-of-the-art inference platform for advertising systems. The ideal candidate will have... ...role, with expertise in machine learning deployment and high-performance computing. This position also offers a competitive salary range...Performance
$224k - $356.5k
...next era of computing. An era in which our GPU acts as the brains of computers, robots,... ..., NVLINK, and NVSwitch RDMA fabrics—for performance and scalability, ensuring they meet real... ..., network validation, and workflow optimization. ~ In-depth experience with network management...Performance$220.2k - $330.4k
...Embedded IoT (IE‑IoT) BU leads the transformation of... ...for generative AI inference and computer vision workloads... ...the accessibility and performance of a datacenter... ...developing innovative genAI and hybridAI solutions... ...instrument, profile, and optimize models and pipelines...PerformanceWork experience placementWork at office$157k - $271.4k
...advances patient empowerment, surgical performance, operating room (OR) collaboration and... ...We are recruiting for a Principal AI Lead within the Polyphonic® Applied AI and... ...hallucination Model Integration & Inference Optimization Integrate sophisticated models with...PerformanceLocal areaImmediate start- A leading technology company is seeking a Fellow in AI Software to drive the software optimization strategy for top-tier customers. This role involves defining technical vision, leading workload performance engineering, and engaging with customers to deliver tailored solutions...Performance
$224k - $356.5k
...Python, Numpy, JAX, MLIR…), and optimization at runtime for maximum flexibility and performance. Our libraries follow CUDA Everywhere... ...kernels for math libraries and lead design reviews with all... ...cross-compilation, setting up CPU/GPU/accelerator (cross-)compilation...PerformanceFlexible hoursShift work- ...Founder Vice President, AI Inference Software About the... ...the future of high-performance AI inference. The successful... ...include leading the architecture and delivery... ...background in building or optimizing production-scale AI... ...skills in areas such as GPU programming, kernel...Performance
$244.8k
...experienced Multimodal Model Training and Inference Optimization Engineer with expertise in optimizing... ...edge of AI efficiency, enhancing the performance, scalability, and deployment of large-... ...do great things with great people. We lead with curiosity, humility, and a desire...PerformanceTemporary workLocal area$242.3k - $363.5k
...Robotics System Architecture Lead to define and own the end-to-... ...principles for real-time behavior, performance, safety isolation,... ...across heterogeneous compute (CPU/GPU/DSP/NPU/MCU). Lead architectural... ..., MCU-based systems, and AI inference runtimes. ~ Familiarity...PerformanceWork experience placementWork from home$350k
...looking for a highly skilled Lead Systems Software Architect who... ...design, implement, debug, and optimize the software platform that spans... ...full system, ensuring it is performant, secure, and scalable across SKUs... ...and enforce memory, CPU/GPU/NPU, and storage budgets across...PerformanceWork at officeLocal areaRemote workMonday to ThursdayFlexible hours$184k - $356.5k
A leading technology company in California is seeking a Senior DL... ...Engineer to drive inference performance for Deep Learning workloads.... ...collaborating with co-design teams to optimize performance across hardware... ...experience in deep learning and GPU programming. This position...Performance- ...engineer in San Jose, CA, focusing on ML infrastructure and performance optimization for large-scale model inference. Ideal candidates should have a strong background in systems engineering and experience with GPU workloads. This hybrid position offers collaboration...Performance
$163.5k - $212.4k
NIO is seeking a Senior AI Inference Infrastructure Software Engineer in San Jose, CA, specializing in building scalable inference... ...of software development experience and strong skills in performance optimization and GPU programming. The position offers a competitive salary...Performance- ...Systems Engineer — Training & Inference Optimization (MBMB) We are building... ...robot foundation models, high-performance training infrastructure, and... ...full compute stack Optimize GPU utilization across training... ...and implement changes that lead to measurable gains in...Performance
$191k - $281k
...confidence. Trusted by leading AI labs, startups,... ...superior infrastructure performance with deep technical... ...need for specialized GPU infrastructure and scalable... .... * Track and optimize key GTM metrics including... ...platforms for AI training, inference, or HPC workloads....PerformancePermanent employmentFull timeContract workTemporary workCasual workWork at officeRemote workFlexible hours$151.8k
...AI Inference Engineer We are looking for an AI Inference Engineer... ...inference hardware, such as GPU, TPU and AI-specific chips. Our... ...are not available. Optimizing ASR inference systems for production... ...Optimizing model inference performance by diving deep into the lower...Performance- ...Title: ML Engineers - with LLM GenAI (3 Resources)... ...Vectorize and index data Inference pipeline - AI Guided workflow... ...Model metrics to evaluate the performance of the model (Predictive & Gen... ...tasks to enable picking the optimal models Apply Guardrails...PerformanceWork experience placement
- ...Summary Apple Silicon GPU SW architecture team... ...principal engineer to lead server-side ML... ...models across many GPUs, optimizing every layer of the stack... ...deep understanding of inference workload characteristics... ...but also dive deep into performance profiling, implement...Performance
$152k - $241.5k
...NVIDIA's invention of the GPU in 1999 sparked the... ...what is possible in AI performance and help build the technology... ...computational graph optimizations for next-generation... ...AI workloads (both inference and training) and successfully... ...-design: Partner with leading experts across our...Performance- ...healthcare organization to identify an Imaging Management Lead who will play a key role in supporting daily operations while fostering a collaborative and high-performing clinical team. If you enjoy mentoring others, optimizing workflows, and contributing to a patient-centered...PerformanceFull timeWeekday work
$152k - $241.5k
...in developing the industry-leading deep learning inference software for NVIDIA AI... ...implementing inference software optimizations to power AI applications... ...deep learning experts and GPU architects throughout the... ...Knowledge of close-to-metal performance analysis, optimization...Performance$184k - $287.5k
...Deep Learning Architect, LLM Inference! NVIDIA is at the... ...focuses on inference server performance optimization for Large Language Models (... ...pushing the boundaries of GPU hardware and software performance... ...launches produce industry leading performance. Collaborate...Performance$147.4k - $272.1k
...Models & Generative AI Inference The Intelligence... ...Generative AI and high-performance systems computing. Your... ...Your tasks will include: Leading the exploration and... ...of our clients on the GenAI inference stack, ensuring performance optimization and alignment with broader...PerformanceRelocation- ...Apple Silicon GPU Driver Engineer, Graphics, Game... ...Principal Engineer To Lead Design Of GPU... ...Orchestrate Distributed Inference Across Multi-Node Clusters... ...Directly Impacts The Performance And Power Efficiency Of... ...And Scheduler Features Optimized For ML/LLM Workloads...Performance
- ...Distributed Training | Optimisation | GPU | Hybrid, San Jose, CA Title:... ...: Productize and optimize models from Research into reliable, performant, and cost-efficient services with... ...Flash Attention) for training and inference without materially degrading quality...Performance
$195k - $230k
...S.C. 1324b(a)(3). Role: Supply Chain Lead Location: San Jose, CA Compensation... ...~ Own end-to-end hardware supply chain performance, from sourcing through final delivery... ...supplier and manufacturing partners to optimize for cost, quality, and reliability...PerformancePermanent employment


