High-Performance AI Inference Engineer (TensorRT)

$124k - $195.5k

NVIDIA Gruppe

NVIDIA Gruppe is looking for a passionate Software Engineer to join its TensorRT team in Santa Clara, California. This role involves designing and developing high-performance AI inference solutions while contributing to performance optimizations and collaborating with various teams. The ideal candidate possesses a Master's or PhD in relevant fields and strong C++ skills. The position offers a competitive base salary ranging from 124,000 USD to 195,500 USD based on location and experience.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the High-Performance AI Inference Engineer (TensorRT) in Santa Clara, CA vacancy
  •  ...Corporation is looking for a passionate Software Engineer to join the TensorRT team in Santa Clara, California. You will...  ...in deep learning and work with cutting-edge AI technology, contributing to high-performance AI inference solutions. Your role involves designing and... 
    Performance

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  • $152k - $241.5k

    NVIDIA Gruppe is seeking a Senior Software Engineer - AI Inference in Santa Clara, California. This role involves enhancing open-source LLM serving optimizations and implementing high-performance runtime capabilities. Candidates should have 5+ years of experience in building... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $152k - $241.5k

     ...and benchmark GenAI inference on NVIDIA's latest...  ...the industry’s performance standards across language...  ...directly within TensorRT-LLM, SGLang, and...  ...GPU performance engineering and public accountability...  ...other emerging AI use cases....  ...learning inference or high-performance systems... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $152k - $241.5k

     ...in the age of Generative AI? Join NVIDIA’s TensorRT team to help lead a first...  ...point for out-of-framework inference globally. We are moving...  ...of AI agents to produce high-performance, high-quality, modern C++...  ...are a systems‑thinking C++ engineer who wants to help scale... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $152k - $241.5k

    Senior Software Engineer - Deep Learning Inference What you’ll be doing: Craft and develop robust inferencing software...  ...multiple platforms for functionality and performance Develop components of TensorRT, NVIDIA’s SDK for high-performance deep learning inference. Closely... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  •  ...Role: AI Inference Engineer Location: San Jose, CA Duration: 6 to...  ...Overview: We are seeking a highly skilled AI Inference Engineer...  ...our team and drive the performance, scalability, and reliability...  ..., Triton Inference Server, TensorRT-LLM, TorchServe, or KServe... 
    Performance

    Triune Infomatics Inc

    San Jose, CA
    17 hours ago
  • $100k

    High Speed AI Interconnect Signal Integrity Engineer Tenstorrent is leading the industry on cutting‑edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI...  ...technologies for next‑generation AI inference and training clusters. This... 
    Performance
    Permanent employment

    Tenstorrent Inc.

    Santa Clara, CA
    4 days ago
  • $184k - $287.5k

     ...unlimited potential of AI to define the next era...  ...Develop reproducible, high-fidelity evaluation frameworks covering performance, quality and developer...  ...distributed training, and inference/serving—and with model/...  ...Computer Science, Electrical Engineering, or related field (or... 
    Performance

    NVIDIA

    Santa Clara, CA
    4 days ago
  • $152k - $241.5k

     ...learning ignited modern AI — the next era of...  ...Deep Learning Compiler Engineer. NVIDIA is hiring software...  ...the backbone of NVIDIA’s inference engine, spanning across...  ...deliver leading inference performance, fast build time,...  ...kernel generation with high performance and fast build... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $272k - $431.25k

    NVIDIA Gruppe is looking for a Principal Software Engineer to advance open-source AI inference. This hands-on role emphasizes running high-performance inference on NVIDIA platforms and involves collaboration across various teams. Key responsibilities include optimizing... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • NVIDIA Gruppe in Santa Clara is seeking a Deep Learning Software Engineer focused on improving performance of deep learning inference software like TensorRT. The ideal candidate will have a strong foundation in C++ and Python, relevant experience with deep learning frameworks... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $248.71k - $292.6k

     ...Groq delivers fast, efficient AI inference. Our LPU-based system...  ...we are on a mission to make high performance AI compute more accessible...  ...Build fast. Sr. Staff Software Engineer - High Performance GPU...  ...production (e.g., Triton, TensorRT, or custom GPU services). Deploying... 
    Performance

    Groq

    Palo Alto, CA
    4 days ago
  • $184k - $356.5k

    NVIDIA Gruppe is looking for skilled software engineers to develop AI inference systems that operate with high efficiency. The role involves architecting high-performance inference frameworks and optimizing GPU processes. Ideal candidates should have extensive programming... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $124k - $195.5k

    NVIDIA Corporation is seeking an AI Inference Performance Engineer - New College Grad 2026 in Santa Clara. This role involves optimizing AI inference benchmarks using NVIDIA’s accelerators and working with various teams on performance enhancements. Applicants should have... 
    Performance

    NVIDIA Corporation

    Santa Clara, CA
    3 days ago
  • $152k - $241.5k

    NVIDIA Gruppe is seeking a talented individual to optimize and benchmark GenAI inference using the latest acceleration technologies. The role involves driving industry benchmark results and architecting distributed inference systems. Required qualifications include a relevant... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $184k - $287.5k

     ...tapping into the unlimited potential of AI to define the next era of computing. An era...  ...platforms, identifying fundamental performance limiters. Prioritize and solve performance...  ...most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $229.9k - $262.4k

     ...Sr. Lead AI Engineer (Inference Optimization, FM hosting, AI Platform) Overview: At Capital One, we are creating responsible and...  ...capabilities with breakthrough product experiences and scalable, high-performance AI infrastructure. At Capital One, you will help bring... 
    Performance
    Full time
    Part time
    Local area

    Capital One Financial Corp

    San Jose, CA
    4 days ago
  • $152k - $204k

     ...Senior Software Engineer, Inference Sunnyvale, CA / Bellevue...  ...The Essential Cloud for AI™. Built for pioneers...  ...superior infrastructure performance with deep technical...  ...custom accelerators for high-efficiency workloads...  ...frameworks (vLLM, Triton, TensorRT-LLM, Ray Serve,... 
    Performance
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Flexible hours
    Shift work

    CoreWeave

    Sunnyvale, CA
    3 days ago
  • $120.1k - $225.7k

     ...Entails End-to-End Inference Optimization: Lead the optimization...  ...: Design and implement high-performance inference frameworks;...  ...team members to build a robust AI inference technical ecosystem...  ...Computer Science, Electronic Engineering, AI, or related fields;... 
    Performance
    Relocation package

    Tencent

    Palo Alto, CA
    2 days ago
  •  ...Devices is seeking a strategic software engineering lead in Santa Clara, California. This role involves improving application performance and engaging in sophisticated software engineering...  ...include developing techniques for inference optimization and supporting the ROCm... 
    Performance

    Advanced Micro Devices

    Santa Clara, CA
    3 days ago
  •  ...time large language model inference? Join NVIDIA’s TensorRT Edge‑LLM team and help...  ...next generation of edge AI for automotive and robotics...  ...and robotics to deliver high‑performance, production‑ready solutions...  ...Science, Electrical/Computer Engineering, or a closely related... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $124k - $195.5k

    Deep Learning Software Engineer, TensorRT Performance We are now looking for a Deep Learning...  ...performance of NVIDIA’s inference ecosystem. NVIDIA is...  ...breakthroughs in areas like Generative AI, Recommenders and Vision...  ...employer. As we highly value diversity in our current... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $132k - $330k

    Software Engineer, AI Inference Codesign The AI inference co-design team's goal is to take research...  ...cutting-edge MLIR compiler and solve the performance bottlenecks needed to achieve real-...  ...utilization and throughput Implement/improve highly performant micro kernels for Tesla's... 
    Performance
    Hourly pay
    Full time
    Temporary work
    Flexible hours

    Tesla Motors, Inc.

    Palo Alto, CA
    22 hours ago
  •  ...Principal AI/ML System Software Engineer At d-Matrix, we are focused on unleashing...  ...with distributed, high-performance software design and implementation...  ...fields Experience with inference servers/model serving frameworks (such as TensorRT-LLM, vLLM, SGLang, etc.)... 
    Performance
    Work experience placement
    3 days per week

    d-Matrix

    Santa Clara, CA
    2 days ago
  • $184k - $287.5k

    2100 NVIDIA USA in Santa Clara is seeking a Senior High Performance AI Engineer to design and optimize cutting-edge AI systems. The role involves collaboration with software and hardware teams to create innovative runtimes and orchestration tools for the CUDA ecosystem... 
    Performance

    2100 NVIDIA USA

    Santa Clara, CA
    1 day ago
  • $188k - $275k

     ...Essential Cloud for AI™. Built for...  ...superior infrastructure performance with deep...  ...What You'll Do: Inference Platform Team The...  ...powering low-latency, high-throughput AI workloads...  ...a Staff Software Engineer (IC5) on the...  ...as vLLM, Triton, TensorRT-LLM, Ray Serve, or... 
    Performance
    Permanent employment
    Temporary work
    Casual work
    Work at office
    Flexible hours

    CoreWeave

    Sunnyvale, CA
    8 days ago
  •  ...generation computing experiences-from AI and data centers, to PCs, gaming...  ...for a Senior Staff AI Infra Engineer who is passionate about improving the performance of key applications and benchmarks...  ...and accelerate LLM training and inference on AMD GPUs, improving kernel, communication... 
    Performance

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    22 hours ago
  •  ...AI Platform Engineer - Training & Inference Saviynt's AI-powered identity platform manages and governs human...  ...), and NVIDIA Triton (TensorRT/ONNX) as a unified deployment graph...  ...memory sharing • Optimise inference performance: configure fractional GPU allocation... 
    Performance

    Saviynt

    Milpitas, CA
    3 days ago
  •  ...We're a team of engineers, clinicians, and innovators...  ...helps care teams perform with greater precision...  .... As a Senior AI/ML Research Engineer...  ...multimodal stack-where a high-level model...  ...Real-time / edge inference optimization (e.g., TensorRT, NVIDIA Jetson).... 
    Performance
    Local area
    Worldwide
    Flexible hours

    Intuitive

    Sunnyvale, CA
    2 days ago
  • $152k - $241.5k

    NVIDIA Gruppe is seeking a passionate C++ engineer to join the TensorRT team in Santa Clara, California. This role focuses on architecting an AI-native framework, utilizing advanced...  ...learning techniques and optimizing performance through agent-based workflows. If you... 
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
