
Principal AI Inference Engineer Open-Source & GPU-Focused

$272k - $431.25k

NVIDIA Gruppe

NVIDIA Gruppe is looking for a Principal Software Engineer to advance open-source AI inference. This hands-on role emphasizes running high-performance inference on NVIDIA platforms and involves collaboration across various teams. Key responsibilities include optimizing inference runtimes, improving efficiency, and mentoring other engineers. A strong background in systems engineering, LLM serving, and programming in Rust, C++, Python, and CUDA is required. The base salary ranges from $272,000 to $431,250, with additional equity and benefits.

Vacancy posted 23 hours ago
Similar jobs that could be interesting for you
Based on the Principal AI Inference Engineer Open-Source & GPU-Focused in Santa Clara, CA vacancy
  •  ...Member of Technical Staff — CI Engineer to improve CI reliability for their open-source LLM inference engine. The role requires 3+ years...  ...CI/CD, knowledge of Linux and GPU computing, as well as strong skills...  ...about building world-class AI infrastructure, ensuring fast and... 
    Suggested

    RadixArk

    Palo Alto, CA
    1 day ago
  •  ...Advanced Micro Devices is seeking a principal software developer to join the ROCm GPU-compute team in Santa Clara, California. The ideal candidate will have over 10 years of software development experience in C/C++, Python, and GPU technologies. This role involves developing... 
    Suggested

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    23 hours ago
  •  ...computing experiences-from AI and data centers, to PCs,...  ...for a Senior Staff AI Infra Engineer who is passionate about...  ...benchmarks, with a special focus on AI/ML workloads and GPU-accelerated computing. As...  ...accelerate LLM training and inference on AMD GPUs, improving... 
    Principal

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    5 days ago
  • A leading technology company is seeking a Senior System Software Engineer to develop GPU-accelerated AI inference serving software. The ideal candidate will have over 5 years of experience with deep learning software, strong skills in Rust and C++, and a collaborative... 
    Suggested

    NVIDIA Corporation

    Santa Clara, CA
    22 hours ago
  • $175.8k - $293k

     ...advantage. We are looking for a Principal AI Engineer to build our next-...  ...first) across AI pipelines, inference services, orchestration layers...  ...LLMs (commercial and/or open-source) including model selection,...  ...Kubernetes, model serving, GPU optimization). Our commitment... 
    Principal

    BMC Software

    Santa Clara, CA
    4 days ago
  • $152k - $241.5k

     ...optimize and benchmark GenAI inference on NVIDIA's latest...  ...sits at the intersection of GPU performance engineering and public accountability....  ...workflows, and other emerging AI use cases. Collaborate with...  ...LLM, vLLM, SGLang, and other open-source projects. Partner with architecture... 

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $272k - $431.25k

     ...NVIDIA is the platform for every new AI-powered application. We seek a Principal Software Engineer - AI Inference to advance open-source LLM serving. This role involves contributing...  ...intersection of inference runtime architecture, GPU performance engineering, and distributed... 
    Principal

    NVIDIA Gruppe

    Santa Clara, CA
    22 hours ago
  • $184k - $356.5k

    NVIDIA Gruppe is looking for skilled software engineers to develop AI inference systems that operate with high efficiency. The role involves architecting high-performance inference frameworks and optimizing GPU processes. Ideal candidates should have extensive programming... 

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  •  ...possible together. As a Principal AI/ML at JPMorgan...  ..., software engineering, and product management...  ...Engineers, focusing on best practices...  ...infrastructure, including inference, training,...  ...Acceleration (e.g., GPU, TPU, RDMA), or...  ...and optimizing open‑source ML frameworks. Recognized... 
    Principal

    Aumni

    Palo Alto, CA
    22 hours ago
  • $275.8k - $340.5k

     ...unique demands of AI and ML innovation,...  ...productivity of ML engineers, and drive the...  ...AI Validation & Inference: Ensures robust model...  ...Overview: The Principal AI/ML Engineer will...  ...specific tasks. The focus is on developing...  ...Experience with open-source orchestration platforms... 
    Principal
    Local area
    Remote work
    Work from home
    Relocation
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    5 days ago
  •  ...experiences-from AI and data centers...  ...are seeking a Principal Software Quality Engineer to serve as the...  ...on AMD Instinct™ GPU platforms. You will...  ..., and the open-source community running...  ...LLM training and inference (PyTorch, vLLM,...  ...quality-engineering focus, including 5+... 
    Principal
    Contract work
    Shift work

    Advanced Micro Devices, Inc.

    San Jose, CA
    5 days ago
  • $296.3k

     ...unique demands of AI and ML innovation,...  ...productivity of ML engineers, and drive the...  ...AI Validation & Inference: Ensures robust model...  ...Overview: The Principal AI/ML Engineer will...  ...specific tasks. The focus is on developing...  ...Experience with open-source orchestration platforms... 
    Principal
    Local area
    Work from home
    Flexible hours

    General Motors

    Sunnyvale, CA
    2 days ago
  • $184k - $287.5k

     ...NVIDIA Gruppe in Santa Clara is seeking an AI Systems Engineer to innovate and develop cutting-edge technologies in the AI inference software stack. Candidates should hold a...  ...workloads while actively contributing to open-source projects. The offering includes a competitive... 

    NVIDIA Gruppe

    Santa Clara, CA
    23 hours ago
  • $152k - $241.5k

     ...NVIDIA Gruppe is seeking a Senior Software Engineer – AI Inference in Santa Clara, California. This role involves enhancing open-source LLM serving optimizations and implementing high-performance runtime capabilities. Candidates should have 5+ years of experience in building... 

    NVIDIA Gruppe

    Santa Clara, CA
    22 hours ago
  •  ...computing experiences—from AI and data centers,...  ...We are seeking a Principal GenAI Inference Optimization Engineer to join our Models...  ...team. This role focuses on improving performance...  ...workloads on AMD GPU platforms. You will...  ...where applicable, open-source projects for... 
    Principal

    Advanced Micro Devices, Inc.

    San Jose, CA
    23 hours ago
  • $165k - $242k

     ...Senior Software Engineer, Core Open-Source- Marimo Livingston, NJ / New York, NY / Sunnyvale, CA /...  ...CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave...  ...marimo's open-source ecosystem, focusing on marimo's backend and its Python ecosystem... 
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Remote work
    Flexible hours

    CoreWeave

    Sunnyvale, CA
    1 day ago
  •  ...NVIDIA Gruppe is seeking a Principal AI and ML Infra Software Engineer to join our Hardware Infrastructure team in Santa Clara, CA. In this role, you'll...  ...efficiency by addressing infrastructure deficiencies for GPU Clusters, fostering innovations in AI/ML research. The... 
    Principal

    NVIDIA Gruppe

    Santa Clara, CA
    23 hours ago
  • $152k - $241.5k

    NVIDIA Gruppe is seeking a talented individual to optimize and benchmark GenAI inference using the latest acceleration technologies. The role involves driving industry benchmark results and architecting distributed inference systems. Required qualifications include a relevant... 

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $174k - $252k

    Google Inc. is seeking a Senior Software Engineer in Sunnyvale, CA, to develop and enhance Cloud Dataproc, focusing on big data technologies like Hadoop and Spark. The...  ...distributed systems and working with open-source frameworks, while driving technical design for... 
    Full time

    Google Inc.

    Sunnyvale, CA
    2 days ago
  • NVIDIA Corporation in Santa Clara seeks a Principal Software Engineer - AI Inference to advance open-source LLM serving. This hands-on role focuses on optimizing inference engines like vLLM and SGLang for NVIDIA GPUs, requiring deep technical skill and collaboration across... 
    Principal

    NVIDIA Corporation

    Santa Clara, CA
    3 days ago
  • $272k - $431.25k

     ...NVIDIA Corporation seeks a Principal AI and ML Infra Software Engineer in Santa Clara, California, to enhance the efficiency of AI/ML research on GPU Clusters. The role involves collaboration with various teams, monitoring infrastructure performance, and implementing... 
    Principal

    NVIDIA

    Santa Clara, CA
    22 hours ago
  • $110k - $300k

     ...redefining the future of AI with our...  ...Our talented team of engineers and industry‑leading executives...  ...applications. Improve inference efficiency and model compression...  ..., and contribute to open‑source projects when...  ...machine learning, with a focus on edge AI and lightweight... 

    TETRAMEM INC

    San Jose, CA
    23 hours ago
  • $124k - $195.5k

     ...NVIDIA Corporation is seeking an AI Inference Performance Engineer - New College Grad 2026 in Santa Clara. This role involves optimizing AI inference benchmarks using NVIDIA’s accelerators and working with various teams on performance enhancements. Applicants should have... 

    NVIDIA

    Santa Clara, CA
    22 hours ago
  • $195.2k - $275.58k

    The Software and AI (SAI) organization...  ...Software Development Engineer to contribute to...  ..., cross‑platform, open‑source performance library...  ...deep‑learning inference and training throughput...  ...—remote or onsite—focused on employee well‑...  ...on Linux GPU optimizations (OpenCL... 
    Local area
    Remote work
    Worldwide
    Flexible hours
    Shift work

    Intel Corporation

    Santa Clara, CA
    3 days ago
  • $230k - $250k

     ...available to any downstream query engine and use case (from traditional analytics to real-time AI / ML). We are a team...  ...bridge between the worlds of open source and enterprise: contributing directly...  ...) - fully paid so you can focus your energy on your newest addition... 
    Odd job
    Work at office
    Remote work

    OneHouse LLC

    Sunnyvale, CA
    5 days ago
  •  ...world’s largest AI chip, 56 times...  ...leading training and inference speeds and...  ...faster than GPU-based hyperscale...  ...Infrastructure Operations Engineer (SiteOps) is an...  ...role focused on the deployment...  ...Senior and Principal IC roles within...  ...GPU. Publish and open source their cutting-edge... 
    Internship
    Work at office

    Dormont Manufacturing Company

    Sunnyvale, CA
    23 hours ago
  • $272k - $425.5k

    Principal Software Engineer – Large-Scale LLM Memory and Storage...  ..., low-latency inference framework for serving generative AI and reasoning models...  ...orchestrates GPU shards, routes requests...  ...-LLM), with a focus on KV-cache...  ...external forums (open source, conferences, and... 
    Principal
    Local area
    Remote work

    NVIDIA

    Santa Clara, CA
    23 hours ago
  • $272k - $431.25k

     ...Dynamo is an innovative, open-source platform focused on efficient, scalable inference for large language and...  ...models in distributed GPU environments. By bringing...  ...high-performance AI inference for demanding...  ...and we’re searching for engineers enthusiastic about building... 
    Principal

    NVIDIA Gruppe

    Santa Clara, CA
    23 hours ago
  • $147k - $237.5k

     ...Integrity, and Inclusion. We weave AI into the fabric of...  ...Job Summary The ADEM engineering team is the engine of innovation...  ...; we create them. As a Principal Engineer focused on the Agent, you will be at...  ...history of contributing to open-source projects (e.g., related to... 
    Principal
    Permanent employment
    Full time
    Work at office
    Local area

    Palo Alto Networks

    Santa Clara, CA
    13 days ago
  • $152k - $241.5k

     ...is the platform upon which every new AI‑powered application is built. We are seeking a Senior Software Engineer – AI Inference to advance open‑source LLM serving by contributing directly to...  ...guide optimization work. Improve multi‑GPU inference performance and reliability:... 

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
