Senior Deep Learning Software Engineer, LLM Performance

$184k - $287.5k

NVIDIA

We are now looking for a Senior Deep Learning Software Engineer, LLM Performance! NVIDIA is seeking an experienced Deep Learning Engineer passionate about analyzing and improving the performance of LLM inference. NVIDIA is rapidly growing its research and development for deep learning inference and is seeking excellent software engineers at all levels of expertise to join our team. Companies around the world are using NVIDIA GPUs to power a revolution in deep learning, enabling breakthroughs in areas like LLMs, generative AI, recommenders and vision that have put DL into every software solution. Join the team that builds the software to enable the performance optimization, deployment and serving of these DL solutions. We specialize in developing GPU-accelerated deep learning software like TensorRT, DL benchmarking software and performant solutions to deploy and serve these models.

Collaborate with the deep learning community to implement the latest algorithms for public release in TensorRT LLM, vLLM, SGLang and LLM benchmarks. Identify performance opportunities and optimize SoTA LLMs across the spectrum of NVIDIA accelerators, from datacenter GPUs to edge SoCs. Implement LLM inference, serving and deployment algorithms and optimizations using TensorRT LLM, vLLM, SGLang, Triton and CUDA kernels. Collaborate with a diverse set of teams working on performance modeling, performance analysis, kernel development and inference software development.

What you'll be doing:

  • Performance optimization, analysis, and tuning of LLM, VLM and GenAI models for DL inference, serving and deployment in NVIDIA/OSS LLM frameworks.

  • Scale performance of LLM models across different architectures and types of NVIDIA accelerators.

  • Optimize performance for maximum throughput, minimum latency, and throughput under latency constraints.

  • Contribute features and code to NVIDIA/OSS LLM frameworks, inference benchmarking frameworks, TensorRT, and Triton.

  • Work with cross-collaborative teams across generative AI, automotive, image understanding, and speech understanding to develop innovative solutions.

What we need to see:

  • Bachelor's, Master's, or PhD degree, or equivalent experience, in a relevant field (Computer Engineering, Computer Science, EECS, AI).

  • At least 8 years of relevant software development experience.

  • Excellent Python/C/C++ programming, software design and software engineering skills.

  • Experience with a DL framework like PyTorch, JAX, TensorFlow.

Ways to stand out from the crowd:

  • Prior experience with an LLM framework or a DL compiler in inference, deployment, algorithms, or implementation.

  • Prior experience with performance modeling, profiling, debugging, and code optimization of a DL/HPC/high-performance application.

  • Architectural knowledge of CPUs and GPUs.

  • GPU programming experience (CUDA or OpenCL).

GPU deep learning has provided the foundation for machines to learn, perceive, reason and solve problems posed using human language. The GPU started out as the engine for simulating human imagination, conjuring up the amazing virtual worlds of video games and Hollywood films. Now, NVIDIA's GPU runs deep learning algorithms, simulating human intelligence, and acts as the brain of computers, robots and self-driving cars that can perceive and understand the world. Just as human imagination and intelligence are linked, computer graphics and artificial intelligence come together in our architecture. Two modes of the human brain, two modes of the GPU. This may explain why NVIDIA GPUs are used broadly for deep learning, and NVIDIA is increasingly known as “the AI computing company.” Come, join our DL Architecture team, where you can help build the real-time, cost-effective computing platform driving our success in this exciting and quickly growing field.

#LI-Hybrid

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until April 20, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 3 days ago
