Senior LLM Inference Kernel Engineer (Distributed GPU)

Advanced Micro Devices

Advanced Micro Devices in Santa Clara seeks a Senior ML Engineer focused on optimizing large language model inference runtimes. The role involves architecting distributed systems and enhancing performance across GPUs. Ideal candidates will have expertise in Python and C/C++, familiarity with deep learning frameworks, and a Master’s or PhD in a related field. Join us to push the boundaries of AI and shape impactful technology. We offer a culture of innovation and substantial benefits to our workforce.

Vacancy posted 1 day ago

Similar jobs that could be interesting for you
Based on the Senior LLM Inference Kernel Engineer (Distributed GPU) in Santa Clara, CA vacancy

Senior Software Development Engineer - LLM Kernel & Inference Systems
...your career. THE ROLE As a senior member of the LLM inference framework team, you will be... ...‑grade single-node and distributed inference runtimes for large... ...intersection of inference engines, distributed systems, and GPU runtime and kernel backends. THE PERSON You are...
Senior
Advanced Micro Devices
Santa Clara, CA
1 day ago
Senior LLM Performance Engineer - GPU Inference
$184k - $356.5k
A leading AI computing company in California is seeking a Senior Deep Learning Software Engineer focused on performance optimization of LLM models. You will analyze and enhance LLM inference performance, working in cross-collaborative teams to implement cutting-edge algorithms...
Senior
Full time
NVIDIA Corporation
Santa Clara, CA
13 hours ago
Senior Formal Verification Engineer, GPU Kernels
$184k - $287.5k
Overview We are looking for a Senior Formal Verification Engineer for GPU Kernels. NVIDIA's Deep Learning Safety Team is hiring engineers to build verification... ...of concurrent software. Experience building LLM agents with tool use and multi‑step reasoning, or with...
Senior
Work experience placement
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Senior Edge-LLM Real-Time Inference Engineer
...NVIDIA Gruppe is looking for a skilled engineer to join their TensorRT Edge-LLM team in Santa Clara, California. The role involves developing a state-of-the-art inference framework for large language models and optimizing it for real-time performance on embedded platforms...
Senior
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior Software Engineer, AI Inference Systems
$184k - $287.5k
...and motivated software engineers to join us and build AI inference systems that serve large... ...inference stacks, optimize GPU kernels and compilers, drive... ..., parallel programming, distributed systems, deep learning theories... ...building and optimizing LLM inference engines (e.g.,...
Senior
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Senior Backend Engineer: Distributed Systems for AI Inference
...· Department: Backend Engineer · Work type: On-Site... ...a real-time multimodal LLM for real life, transforming... ...scalable, and resilient distributed systems. You’ll work... ..., low-latency AI model inference and data services. Partner... ...performance across GPU clusters, cloud infrastructure...
Senior
Full time
Neara
Palo Alto, CA
4 days ago
Senior Formal Verification Engineer, GPU Kernels
$184k - $287.5k
Locations: US, CA, Santa Clara. Time type: Full... ...of concurrent software. Experience building LLM agents with tool use and multi-step reasoning, or with...
Senior
Work experience placement
NVIDIA Corporation
Santa Clara, CA
1 day ago
Senior DL Inference Engineer - GPU-Accelerated AI, Equity
$184k - $356.5k
NVIDIA Gruppe is looking for a Senior Software Engineer specializing in Deep Learning Inference in Santa Clara, California. You will design and optimize GPU-accelerated software critical for advanced AI applications, contributing to libraries like vLLM and SGLang. Ideal...
Senior
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Senior DL Inference Engineer - GPU Optimization Equity
NVIDIA is seeking a Senior DL Algorithms Engineer to optimize LLM/Omni models and enhance performance across its software stack. The ideal candidate will... ...of experience in deep learning, specifically in inference. This role involves profiling, analyzing bottlenecks,...
Senior
NVIDIA
Santa Clara, CA
2 days ago
Senior LLM Performance Engineer - GPU-Accel & Equity
NVIDIA Gruppe is seeking a Senior Deep Learning Software Engineer focused on LLM performance in Santa Clara. You will optimize GPU-accelerated software for large language model deployment, working on performance tuning for various models. The ideal candidate has over 8...
Senior
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Senior Performance Engineer, Inference
...‑leading training and inference speeds and empowers machine... ...10 times faster than GPU‑based hyperscale cloud... ...Role We are hiring a Senior Performance Analyst to... ..., SGLang, TensorRT‑LLM), GPU kernel‑level optimization toolchains... ...with Product and Engineering to identify where...
Senior
Contract work
Shift work
Cerebras
Sunnyvale, CA
13 hours ago
Senior GPU Kernel Verification Engineer — Formal Methods & AI
NVIDIA Gruppe is seeking a Senior Formal Verification Engineer for GPU Kernels, focused on creating verification tools that ensure correct behavior in various environments. This role involves designing verification tools, integrating AI into workflows, and participating...
Senior
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Senior AI Systems Engineer: Inference Kernels & Runtimes
$184k - $287.5k
NVIDIA Gruppe is seeking talented AI systems engineers to advance innovative technologies in AI inference systems software. This role involves developing cutting-edge libraries, code generators, and kernel technologies for NVIDIA's architecture, emphasizing high-impact...
Senior
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Senior GPU Kernel Verification Engineer — AI + Formal Methods
$184k - $287.5k
NVIDIA Corporation is seeking a Senior Formal Verification Engineer for GPU Kernels in Santa Clara, CA. In this role, you will develop and deliver verification tools for GPU kernels, integrating AI into verification workflows. The ideal candidate has an MS or PhD in Computer...
Senior
NVIDIA Corporation
Santa Clara, CA
2 days ago
Senior AI Inference Systems Engineer: GPU-Optimized, Cloud
$184k - $356.5k
NVIDIA Gruppe is looking for skilled software engineers to develop AI inference systems that operate with high efficiency. The role involves architecting high-performance inference frameworks and optimizing GPU processes. Ideal candidates should have extensive programming...
Senior
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Principal Software Engineer (AI Inference / Distributed Systems)
...looking for a strategic software engineering lead who is passionate about... ...scale‑up and scale‑out inference. Develop methods and tooling... ...sglang, or vllm and with kserve, llm‑d. Experience running... ...used to optimize inference like distributed kv‑cache, disaggregation, request...
AMD
Santa Clara, CA
3 days ago
Principal Software Engineer (AI Inference / Distributed Systems)
...looking for a strategic software engineering lead who is passionate about... ...scale-up and scale-out inference. Develop methods and tooling... ...sglang, or vllm and with kserve, llm-d. Experience running... ...used to optimize inference like distributed kv-cache, disaggregation, request...
Advanced Micro Devices, Inc.
Santa Clara, CA
4 days ago
Senior CUDA DL Frameworks & HPC Engineer
$184k - $287.5k
...Visualization. The GPU, our invention,... ...motivated Deep Learning engineer to bring advanced CUDA features and Distributed Runtime... ...including PyTorch, TRT-LLM, vLLM, SGLang, JAX... ...up to 100K GPUs to inference down at microsecond... ...and SW architects, kernel and compiler...
Senior
2100 NVIDIA USA
Santa Clara, CA
3 days ago
Principal AI Inference Systems Engineer
...AMD is looking for a Senior Staff AI Infra Engineer who is passionate... ...AI/ML workloads and GPU-accelerated computing... ...Optimize and accelerate LLM training and inference on AMD GPUs, improving kernel, communication, and... .../ML infrastructure, distributed systems, or...
Advanced Micro Devices, Inc.
Santa Clara, CA
3 days ago
Senior Kernel & Compiler Performance Engineer (GPU/AI)
...California is looking for a Member of Technical Staff for Kernel/Compiler/Communication. This critical role requires strong expertise in CUDA and GPU optimization, along with 5+ years of experience in performance engineering. The ideal candidate will design high-performance...
Senior
RadixArk
Palo Alto, CA
3 days ago
Senior AI Inference & Distributed Systems Engineer
...Advanced Micro Devices is seeking a strategic software engineering lead in Santa Clara, California. This role involves improving application... .... Key responsibilities include developing techniques for inference optimization and supporting the ROCm ecosystem expansion. A Bachelor...
Senior
Advanced Micro Devices, Inc.
Santa Clara, CA
3 days ago
Senior DL Algorithms Engineer - Inference Performance
$152k - $241.5k
We are looking for a Senior DL Algorithms Engineer for LLM/Omni model optimizations! Seeking senior engineers... ...of the hardware/software stack from GPU architecture to Deep Learning Framework... ...and Cosmos) on NVIDIA’s accelerated inference SW stack. Contribute new features,...
Senior
NVIDIA
Santa Clara, CA
2 days ago
Senior GPU Math Library Engineer: AI & HPC Kernel Lead
NVIDIA Gruppe is looking for a senior engineer to join their Math Libraries team in Santa Clara, California. This role involves designing... ...numerical linear algebra software on GPUs, with a strong focus on kernel generation. The ideal candidate has over 8 years of...
Senior
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Senior Compiler Engineer, AI Inference Performance
$152k - $241.5k
NVIDIA's invention of the GPU in 1999 sparked the growth of the PC... ...an AI & Deep Learning Compiler Engineer. NVIDIA is hiring software engineers... ...been the backbone of NVIDIA’s inference engine, spanning across data... ..., such as PyTorch, JAX. GPU kernel authoring and performance...
Senior
NVIDIA
Santa Clara, CA
3 days ago
Senior AI Inference Performance Engineer (GPU/Cluster)
$152k - $241.5k
...individual to optimize and benchmark GenAI inference using the latest acceleration... ...industry benchmark results and architecting distributed inference systems. Required qualifications... ...Python or C++. A deep understanding of LLM architectures is necessary. The base salary...
Senior
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Senior AI Kernel & Inference Engineer
...leading technology company is seeking a Senior AI Software Engineer to join their team in Santa Clara,... ...groundbreaking AI systems software for inference applications including deep learning framework optimizations and GPU kernel technologies. You will closely collaborate...
Senior
NVIDIA Corporation
Santa Clara, CA
4 days ago
Senior Software Engineer - AI Inference
$152k - $241.5k
...built. We are seeking a Senior Software Engineer – AI Inference to advance open‑source LLM serving by... ...orchestration to C++/CUDA kernels—using data to guide optimization... ...work. Improve multi‑GPU inference performance... .... Familiarity with distributed systems concepts and concurrency...
Senior
NVIDIA Gruppe
Santa Clara, CA
4 days ago
Senior Deep Learning Framework Communications Engineer
...Visualization. The GPU, our invention, serves... ...‑communication kernels to showcase ultimate... ...experience) with 5+ software engineering and HPC/AI... ...as PyTorch, JAX, and inference engines such as TRT‑LLM, vLLM, SGLang Rapid... ...these areas: Training, Distributed inference, MoE, Reinforcement...
Senior
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Senior Software Development Engineer - SGLang and Inference Stack
...instrumental in enhancing GPU kernel performance, accelerating... ...enabling RL training and SOTA LLM and Multimodal inference at scale across multi-GPU... ...: Skilled engineer with strong technical and... ...and RL-training. ~ Distributed System Optimization: Tune...
Senior
Advanced Micro Devices, Inc.
Santa Clara, CA
5 days ago
Senior DL Compiler Engineer - MLIR & GPU Codegen
...NVIDIA Gruppe is seeking an experienced Compiler Engineer in Santa Clara to design and optimize compiler passes and infrastructure for GPU kernels. You'll work with a dynamic team and be involved in architecture decisions while collaborating across various teams. The ideal...
Senior
NVIDIA Gruppe
Santa Clara, CA
3 days ago
