Senior LLM Inference Kernel Engineer (Distributed GPU)

Advanced Micro Devices

Advanced Micro Devices in Santa Clara seeks a Senior ML Engineer focused on optimizing large language model inference runtimes. The role involves architecting distributed systems and enhancing performance across GPUs. Ideal candidates will have expertise in Python and C/C++, familiarity with deep learning frameworks, and a Master’s or PhD in a related field. Join us to push the boundaries of AI and shape impactful technology. We offer a culture of innovation and substantial benefits to our workforce.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Senior LLM Inference Kernel Engineer (Distributed GPU) vacancy in Santa Clara, CA
  •  ...your career. THE ROLE As a senior member of the LLM inference framework team, you will be...  ...‑grade single-node and distributed inference runtimes for large...  ...intersection of inference engines, distributed systems, and GPU runtime and kernel backends. THE PERSON You are... 
    Senior

    Advanced Micro Devices

    Santa Clara, CA
    1 day ago
  • $184k - $356.5k

    A leading AI computing company in California is seeking a Senior Deep Learning Software Engineer focused on performance optimization of LLM models. You will analyze and enhance LLM inference performance, working in cross-collaborative teams to implement cutting-edge algorithms... 
    Senior
    Full time

    NVIDIA Corporation

    Santa Clara, CA
    13 hours ago
  • $184k - $287.5k

    Overview We are looking for a Senior Formal Verification Engineer for GPU Kernels. NVIDIA's Deep Learning Safety Team is hiring engineers to build verification...  ...of concurrent software. Experience building LLM agents with tool use and multi‑step reasoning, or with... 
    Senior
    Work experience placement

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  •  ...NVIDIA Gruppe is looking for a skilled engineer to join their TensorRT Edge-LLM team in Santa Clara, California. The role involves developing a state-of-the-art inference framework for large language models and optimizing it for real-time performance on embedded platforms... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  • $184k - $287.5k

     ...and motivated software engineers to join us and build AI inference systems that serve large...  ...inference stacks, optimize GPU kernels and compilers, drive...  ..., parallel programming, distributed systems, deep learning theories...  ...building and optimizing LLM inference engines (e.g.,... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  •  ...· Department: Backend Engineer · Work type: On-Site...  ...a real-time multimodal LLM for real life, transforming...  ...scalable, and resilient distributed systems. You’ll work...  ..., low-latency AI model inference and data services. Partner...  ...performance across GPU clusters, cloud infrastructure... 
    Senior
    Full time

    Neara

    Palo Alto, CA
    4 days ago
  • $184k - $287.5k

    Senior Formal Verification Engineer, GPU Kernels. Locations: US, CA, Santa Clara. Time type: Full...  ...of concurrent software. Experience building LLM agents with tool use and multi-step reasoning, or with... 
    Senior
    Work experience placement

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  • $184k - $356.5k

    NVIDIA Gruppe is looking for a Senior Software Engineer specializing in Deep Learning Inference in Santa Clara, California. You will design and optimize GPU-accelerated software critical for advanced AI applications, contributing to libraries like vLLM and SGLang. Ideal... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • NVIDIA is seeking a Senior DL Algorithms Engineer to optimize LLM/Omni models and enhance performance across its software stack. The ideal candidate will...  ...of experience in deep learning, specifically in inference. This role involves profiling, analyzing bottlenecks,... 
    Senior

    NVIDIA

    Santa Clara, CA
    2 days ago
  • NVIDIA Gruppe is seeking a Senior Deep Learning Software Engineer focused on LLM performance in Santa Clara. You will optimize GPU-accelerated software for large language model deployment, working on performance tuning for various models. The ideal candidate has over 8... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  •  ...‑leading training and inference speeds and empowers machine...  ...10 times faster than GPU‑based hyperscale cloud...  ...Role We are hiring a Senior Performance Analyst to...  ..., SGLang, TensorRT‑LLM), GPU kernel‑level optimization toolchains...  ...with Product and Engineering to identify where... 
    Senior
    Contract work
    Shift work

    Cerebras

    Sunnyvale, CA
    13 hours ago
  • NVIDIA Gruppe is seeking a Senior Formal Verification Engineer for GPU Kernels, focused on creating verification tools that ensure correct behavior in various environments. This role involves designing verification tools, integrating AI into workflows, and participating... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $184k - $287.5k

    NVIDIA Gruppe is seeking talented AI systems engineers to advance innovative technologies in AI inference systems software. This role involves developing cutting-edge libraries, code generators, and kernel technologies for NVIDIA's architecture, emphasizing high-impact... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $184k - $287.5k

    NVIDIA Corporation is seeking a Senior Formal Verification Engineer for GPU Kernels in Santa Clara, CA. In this role, you will develop and deliver verification tools for GPU kernels, integrating AI into verification workflows. The ideal candidate has an MS or PhD in Computer... 
    Senior

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • $184k - $356.5k

    NVIDIA Gruppe is looking for skilled software engineers to develop AI inference systems that operate with high efficiency. The role involves architecting high-performance inference frameworks and optimizing GPU processes. Ideal candidates should have extensive programming... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  •  ...looking for a strategic software engineering lead who is passionate about...  ...scale‑up and scale‑out inference. Develop methods and tooling...  ...sglang, or vllm and with kserve, llm‑d. Experience running...  ...used to optimize inference like distributed kv‑cache, disaggregation, request... 

    AMD

    Santa Clara, CA
    3 days ago
  •  ...looking for a strategic software engineering lead who is passionate about...  ...scale-up and scale-out inference. Develop methods and tooling...  ...sglang, or vllm and with kserve, llm-d. Experience running...  ...used to optimize inference like distributed kv-cache, disaggregation, request... 

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    4 days ago
  • $184k - $287.5k

     ...Visualization. The GPU, our invention,...  ...motivated Deep Learning engineer to bring advanced CUDA features and Distributed Runtime...  ...including PyTorch, TRT-LLM, vLLM, SGLang, JAX...  ...up to 100K GPUs to inference down at microsecond...  ...and SW architects, kernel and compiler... 
    Senior

    2100 NVIDIA USA

    Santa Clara, CA
    3 days ago
  •  ...AMD is looking for a Senior Staff AI Infra Engineer who is passionate...  ...AI/ML workloads and GPU-accelerated computing...  ...Optimize and accelerate LLM training and inference on AMD GPUs, improving kernel, communication, and...  .../ML infrastructure, distributed systems, or... 

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    3 days ago
  •  ...California is looking for a Member of Technical Staff for Kernel/Compiler/Communication. This critical role requires strong expertise in CUDA and GPU optimization, along with 5+ years of experience in performance engineering. The ideal candidate will design high-performance... 
    Senior

    RadixArk

    Palo Alto, CA
    3 days ago
  •  ...Advanced Micro Devices is seeking a strategic software engineering lead in Santa Clara, California. This role involves improving application...  .... Key responsibilities include developing techniques for inference optimization and supporting the ROCm ecosystem expansion. A Bachelor... 
    Senior

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    3 days ago
  • $152k - $241.5k

    We are looking for a Senior DL Algorithms Engineer for LLM/Omni model optimizations! Seeking senior engineers...  ...of the hardware/software stack from GPU architecture to Deep Learning Framework...  ...and Cosmos) on NVIDIA’s accelerated inference SW stack. Contribute new features,... 
    Senior

    NVIDIA

    Santa Clara, CA
    2 days ago
  • NVIDIA Gruppe is looking for a senior engineer to join their Math Libraries team in Santa Clara, California. This role involves designing...  ...numerical linear algebra software on GPUs, with a strong focus on kernel generation. The ideal candidate has over 8 years of... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $152k - $241.5k

    NVIDIA's invention of the GPU in 1999 sparked the growth of the PC...  ...an AI & Deep Learning Compiler Engineer. NVIDIA is hiring software engineers...  ...been the backbone of NVIDIA’s inference engine, spanning across data...  ..., such as PyTorch, JAX. GPU kernel authoring and performance... 
    Senior

    NVIDIA

    Santa Clara, CA
    3 days ago
  • $152k - $241.5k

     ...individual to optimize and benchmark GenAI inference using the latest acceleration...  ...industry benchmark results and architecting distributed inference systems. Required qualifications...  ...Python or C++. A deep understanding of LLM architectures is necessary. The base salary... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  •  ...leading technology company is seeking a Senior AI Software Engineer to join their team in Santa Clara,...  ...groundbreaking AI systems software for inference applications including deep learning framework optimizations and GPU kernel technologies. You will closely collaborate... 
    Senior

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  • $152k - $241.5k

     ...built. We are seeking a Senior Software Engineer – AI Inference to advance open‑source LLM serving by...  ...orchestration to C++/CUDA kernels—using data to guide optimization...  ...work. Improve multi‑GPU inference performance...  .... Familiarity with distributed systems concepts and concurrency... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  •  ...Visualization. The GPU, our invention, serves...  ...‑communication kernels to showcase ultimate...  ...experience) with 5+ software engineering and HPC/AI...  ...as PyTorch, JAX, and inference engines such as TRT‑LLM, vLLM, SGLang Rapid...  ...these areas: Training, Distributed inference, MoE, Reinforcement... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  •  ...instrumental in enhancing GPU kernel performance, accelerating...  ...enabling RL training and SOTA LLM and Multimodal inference at scale across multi-GPU...  ...: Skilled engineer with strong technical and...  ...and RL-training. ~ Distributed System Optimization: Tune... 
    Senior

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    5 days ago
  •  ...NVIDIA Gruppe is seeking an experienced Compiler Engineer in Santa Clara to design and optimize compiler passes and infrastructure for GPU kernels. You'll work with a dynamic team and be involved in architecture decisions while collaborating across various teams. The ideal... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
