Principal Software Engineer - Large-Scale LLM Memory and Storage Systems

$272k - $425.5k

NVIDIA

* Locations: US, CA, Santa Clara; US, WA, Remote; US, MA, Remote
* Time type: Full time
* Posted on: Posted Today
* Job requisition ID: JR2010271

NVIDIA Dynamo is a high-throughput, low-latency inference framework for serving generative AI and reasoning models across multi-node distributed environments. Built in Rust for performance and Python for extensibility, Dynamo orchestrates GPU shards, routes requests, and manages shared KV cache across heterogeneous clusters so that many accelerators feel like a single system at datacenter scale. As large language models rapidly outgrow the memory and compute budget of any single GPU, this platform enables efficient, resilient deployment of cutting-edge LLM workloads.

We are seeking a Principal Systems Engineer to define the vision and roadmap for memory management of large-scale LLM and storage systems.

**What you'll be doing:**

* Design and evolve a unified memory layer that spans GPU memory, pinned host memory, RDMA-accessible memory, SSD tiers, and remote file/object/cloud storage to support large-scale LLM inference.
* Architect and implement deep integrations with leading LLM serving engines (such as vLLM, SGLang, TensorRT-LLM), with a focus on KV-cache offload, reuse, and remote sharing across heterogeneous and disaggregated clusters.
* Co-design interfaces and protocols that enable disaggregated prefill, peer-to-peer KV-cache sharing, and multi-tier KV-cache storage (GPU, CPU, local disk, and remote memory) for high-throughput, low-latency inference.
* Partner closely with GPU architecture, networking, and platform teams to exploit GPUDirect, RDMA, NVLink, and similar technologies for low-latency KV-cache access and sharing across heterogeneous accelerators and memory pools.
* Mentor senior and junior engineers, set technical direction for memory and storage subsystems, and represent the team in internal reviews and external forums (open source, conferences, and customer-facing technical deep dives).

**What we need to see:**

* Master's or PhD, or equivalent experience.
* 15+ years of experience building large-scale distributed systems, high-performance storage, or ML systems infrastructure in C/C++ and Python, with a track record of delivering production services.
* Deep understanding of memory hierarchies (GPU HBM, host DRAM, SSD, and remote/object storage) and experience designing systems that span multiple tiers for performance and cost efficiency.
* Experience with distributed caching or key-value systems, especially designs optimized for low latency and high concurrency.
* Hands-on experience with networked I/O and RDMA/NVMe-oF/NVLink-style technologies, and familiarity with concepts like disaggregated and aggregated deployments for AI clusters.
* Strong skills in profiling and optimizing systems across CPU, GPU, memory, and network, using metrics to drive architectural decisions and validate improvements in TTFT and throughput.
* Excellent communication skills and prior experience leading cross-functional efforts with research, product, and customer teams.

**Ways to stand out from the crowd:**

* Prior contributions to open-source LLM serving or systems projects focused on KV-cache optimization, compression, streaming, or reuse.
* Experience designing unified memory or storage layers that expose a single logical KV or object model across GPU, host, SSD, and cloud tiers, especially in enterprise or hyperscale environments.
* Publications or patents in areas such as LLM systems, memory-disaggregated architectures, RDMA/NVLink-based data planes, or KV-cache/CDN-like systems for ML.

With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to outstanding growth, our special engineering teams are growing fast. If you're a creative and autonomous engineer with a genuine passion for technology, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD - 425,500 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until December 26, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.


Vacancy posted 4 days ago

Similar jobs that could be interesting for you

Based on the Principal Software Engineer - Large-Scale LLM Memory and Storage Systems in Santa Clara, CA vacancy

Principal Software Engineer, Rack-Scale System Software — CSP Engagements
$272k - $431.25k
We're looking for a Principal Software Engineer to join our CSP Engagements team as the technical focal point for rack‑scale system SW/FW, working with CSP engineering teams to ensure they... ...system software, platform firmware, or large‑scale distributed systems engineering....
Suggested
Shift work
NVIDIA
Santa Clara, CA
10 hours ago
Senior Cloud Infrastructure Engineer, Large-Scale Systems
$174k - $253k
Google in Sunnyvale, CA is seeking a Software Engineer to develop next-generation technologies that transform how users connect and interact... ...reviews with peers and stakeholders. With a focus on large-scale system design and accessible technologies, candidates must have a...
Suggested
Google
Sunnyvale, CA
10 hours ago
Staff Platform Infra Engineer — Lead Large‑Scale Systems
$207k - $300k
A leading tech company in California is seeking a Software Engineer to work on large-scale systems and projects critical to its needs. The ideal candidate should have a Bachelor's degree and at least 8 years of relevant experience, including leadership in technical roles...
Suggested
Google
Sunnyvale, CA
10 hours ago
Principal Software Engineer, GPU Firmware and GPU System Software — CSP Engagements
$272k - $431.25k
We're looking for a Principal Software Engineer to join our CSP Engagements... ...GPU firmware and GPU system software, working... ...firmware at fleet scale. You will drive work... ...update orchestration for large-scale deployments —... ...execution, compute kernels, memory hierarchy, and how...
Suggested
NVIDIA
Santa Clara, CA
10 hours ago
Principal Software Engineer, At-Scale Reliability and Fleet Intelligence — CSP Engagements
$272k - $431.25k
We're looking for a Principal Software Engineer to join our CSP Engagements... ...point for fleet-scale reliability, working... ...enables you to distinguish systemic architectural gaps... ...failure modes in large‑scale GPU/accelerator... ...compute, interconnect, memory, power, and thermal domains...
NVIDIA
Santa Clara, CA
10 hours ago
Senior GenAI Software Engineer — Large-Scale ML Systems
$174k - $253k
Google Inc. is seeking a Senior Software Engineer to work on Generative AI technologies in Sunnyvale, CA. You will design, develop, and maintain large software systems, managing project priorities and collaborating closely with teams across the organization. With a focus...
Google Inc.
Sunnyvale, CA
1 day ago
Staff Software Engineer, Large-Scale Distributed Systems
Google is seeking a software engineer to work on high-impact projects within Search Ads Serving. You will... ...a key role in latency optimization and system reliability. Ideal candidates have extensive experience in C++, large-scale distributed systems, and software design,...
Google
Mountain View, CA
10 hours ago
Software Engineer, GDC LLM Serving and GPU Performance
$207k - $301k
...experience in software development.... ...degree or PhD in Engineering, Computer... ...information at massive scale, and extend... ...computing, large-scale system design, networking and data storage, security, artificial... ...re-inventing LLM serving by... ...compute and memory to unlock new...
Temporary work
Google
Sunnyvale, CA
10 hours ago
SR Principal Software Engineer - LLM Engineering
SR Principal Software Engineer - LLM Engineering and 1 more Job Information Job Identification... ...services, enabling scale across teams and functions.... ..., GNN serving platforms in large‑scale environments typical... ...optimization and distributed systems for large models focused on...
Full time
Shift work
JPMorganChase
Palo Alto, CA
10 hours ago
Principal Software Engineer — Agentic AI Applications and Foundations
$272k - $431.25k
...assistants and engineering-productivity... ...we need a principal-level, hands‑... ...harden production systems and the... ...ensure they scale. What you'll... ...like mature software, not prototypes... ...agent workflows, memory and context... ...patterns such as LLM‑powered... ...relevant to large‑scale agent collaboration...
Live in
NVIDIA
Santa Clara, CA
10 hours ago
Staff Software Engineer: AI-Driven Large-Scale Infrastructure
Google is seeking software engineers to join the AI and Infrastructure team responsible for operating systems from Kernel to Node userspace. The Agentic Engineering team advances... ...leading a distributed team, designing large-scale software, and delivering reliable, secure...
Google
Sunnyvale, CA
10 hours ago
Lead AI Engineer (AI Foundations, LLM Customization and Finetuning)
$197.3k - $225.1k
...Lead AI Engineer (AI Foundations, LLM Customization and Finetuning... ...and reliable AI systems, changing banking... ..., and support AI software components... ...model training, large language model inference... ...throughput — of large scale production AI... ..., Guardrails, Memory) using Python, C++...
Full time
Part time
Local area
Capital One
San Jose, CA
3 days ago
Lead AI Engineer (FM Hosting, LLM Inference)
$197.3k - $225.1k
...and reliable AI systems, changing... ...applied science and engineering teams to deliver... ...and support AI software components... ...model training, large language model... ...state‑of‑the‑art LLM optimization techniques... ...—of large‑scale production AI... ...VectorDBs, guardrails, memory) using Python,...
Local area
Capital One
San Jose, CA
4 days ago
Senior Software Engineer, AI Inference Systems
$184k - $287.5k
...highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency... ...as fusion, autotuning, and memory/layout optimization; build and... ...Experience building and optimizing LLM inference engines (e.g., vLLM...
NVIDIA
Santa Clara, CA
1 day ago
Staff Software Engineer - AI & Large-Scale Infra Leader
Google is seeking a software engineer in Sunnyvale to build and scale infrastructure and distributed systems for next-generation technologies. You will lead design, development, testing, deployment, and maintenance of large-scale software solutions across teams. The role...
Google
Sunnyvale, CA
3 days ago
Senior Software Engineer - Systems
...transformative areas of large language models and agentic systems, our mission is to... ...of researchers and engineers who thrive on... ...and queryability at scale Develop production‑... ...Architect context and memory systems for conversational... ...Experience with LLM serving, agentic orchestration...
Boson AI
Santa Clara, CA
10 hours ago
Software Engineer - AI Agent Memory Infrastructure
$156k - $316.8k
...Software Engineer - AI Agent Memory Infrastructure Join ByteDance's AI Agent... ...the core memory systems that power next-generation... ...design and operate large-scale, low-latency, and... ...full lifecycle from storage and retrieval to... ...core technologies in LLM applications, including...
Temporary work
ByteDance
San Jose, CA
1 day ago
HPC Systems Software Engineer - Large-Scale DL & Image
$136.3k - $231.7k
KLA is hiring a Software Engineer in Milpitas, California, to join their HPC system software engineering team. The role involves designing and building software for large-scale deep learning and image processing workloads, while ensuring high quality and timely delivery...
KLA
Milpitas, CA
10 hours ago
Lead Staff Engineer, Large-Scale Infrastructure
$262k - $365k
...company in Sunnyvale seeks experienced software engineers to provide technical leadership on... ...development, especially in distributed systems and infrastructure. The role involves... ...managing project priorities, and developing large-scale software solutions. With a competitive...
Google
Sunnyvale, CA
10 hours ago
Principal System Software Engineer - AV Platform
$272k - $431.25k
...NVIDIA is seeking a highly motivated Principal System Software Engineer to drive next-generation... ...architecture, development, optimization, and scaling of foundational software... ...optimization initiatives across CPU, GPU, memory, storage, networking, and platform subsystems...
NVIDIA
Santa Clara, CA
1 day ago
Principal Systems Software Engineer
$272k
NVIDIA is seeking a Sr. Principal Systems Software Engineer for the Apache Spark Acceleration group. GPU accelerated... ...for accelerated computing to handle large data processing needs. Multi-node GPU... ...problems challenges at large scale * Provide recommendations and feedback...
Full time
Work experience placement
NVIDIA
Santa Clara, CA
4 days ago
Senior Infrastructure Engineer, Large-Scale AI & Platform
A leading technology company is seeking a Software Engineer to develop next-generation technologies that change how billions of users... ...participating in design reviews, and debugging complex issues across large-scale systems. Candidates should have a strong foundation in programming...
Google Inc.
Sunnyvale, CA
10 hours ago
Agentic AI Systems Engineer
$152k - $208.5k
...leader in materials engineering solutions used to... ...Overview As a Software Engineer at Applied... ...expertise in intricate systems, deciphering code,... ...agents, tools, memory, planning, validation... ..., runtimes) in large‑scale engineering environments... ...to on‑prem LLM deployment and optimization...
Full time
Relocation
Applied Materials
Santa Clara, CA
2 days ago
Senior Systems Software Engineer, Kubernetes Scale - DGX Cloud
$184k - $287.5k
...edge hardware and software innovation to deliver... ...of innovative engineers dedicated to solving... ...Senior Systems Software Engineer... ...world problems at scale. In this pivotal role... ...clusters at ultra‑large scale, ensuring reliability... ..., Networking, Storage systems, Accelerators...
Worldwide
NVIDIA
Santa Clara, CA
10 hours ago
AI Engineering Intern, Voice & LLM Systems
...drive‑thrus, operating at scale in complex, noisy,... ...technical, entrepreneurial engineering culture. We place a... ...work on real‑world AI systems alongside a high‑performance... ...at the intersection of large language models, real‑... .... Experiment with LLM‑based systems, including...
Full time
Internship
Presto Phoenix, Inc.
Palo Alto, CA
10 hours ago
Senior Software Engineer - TensorRT Edge-LLM
$152k - $241.5k
...limits of real-time large language model... ...NVIDIA’s TensorRT Edge-LLM team and help... ...robotics. We build the software stack that enables... .../Computer Engineering, or a closely related... ...tensor parallelism, or memory‑efficient... ...autoregressive LLM serving systems, including...
NVIDIA
Santa Clara, CA
10 hours ago
Senior Software Engineer - NVLink Rack Scale Stability and Reliability
$152k - $241.5k
...for highly motivated Senior Software Engineers to join our Fabric Networking... ...focus on NVLink Rack-Scale Systems Stability & Reliability. In... ...diagnostics, recovery, and large-scale AI infrastructure, contributing... ..., including PCIe, memory hierarchy, DMA, high-speed interconnects...
NVIDIA
Santa Clara, CA
10 hours ago
Principal Software Engineer, E2E Performance and Goodput — CSP Engagements
$272k - $431.25k
...'re looking for a Principal Engineer to join our CSP Engagements... ...and drive systemic improvements in... ...latest NVIDIA rack-scale systems, GPU architectures... ...GPU capabilities, memory hierarchy changes,... ...configuration, software, or workload... ...dynamics (vLLM, TensorRT-LLM, SGLang,...
NVIDIA
Santa Clara, CA
10 hours ago
Principal, Software Engineer - GenAI Initiative
$143k - $286k
...customer experiences. We seek a Principal, Software Engineer with deep expertise in... ...deployment of advanced AI systems. This role involves architecting... .... Develop and optimize LLM‑based agents for tasks like... ...Drive technical direction for large‑scale AI initiatives and mentor...
Full time
Temporary work
Part time
Walmart
Sunnyvale, CA
10 hours ago
