ML Infra Intern: GPU Kernel Optimization & LLM Profiling

$19 - $65 per hour

A Medium Corporation

PlusAI is seeking a Machine Learning Infrastructure Engineer Intern to advance its AI-based virtual driver software. The role involves identifying bottlenecks in BEV model training and implementing high-performance custom kernels using CUDA or C++. Interns will also explore using LLMs for code generation to optimize processes. The position offers competitive hourly pay ranging from $19 to $65, based on experience and education level. Join PlusAI for hands-on work in an innovative, dynamic field with opportunities for personal and professional growth.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the ML Infra Intern: GPU Kernel Optimization & LLM Profiling vacancy in Santa Clara, CA
  • $19 - $65 per hour

     ...a Machine Learning Infrastructure Engineer Intern to work on high-performance kernels for BEV model training. In this role, you will...  ...internship also explores the use of LLMs to optimize code generation and performance profiling. The hourly pay ranges from $19 to $65 based... 
    Internship
    Hourly pay

    PlusAI

    Santa Clara, CA
    1 day ago
  • $207k - $300k

    Software Engineer, GDC LLM Serving and GPU Performance Google...  ...sequential decision making), ML infrastructure, or...  .... You could be optimizing KV cache transfer mechanisms...  ...down to performance profiling, ensuring Google’s...  ...GPU libraries and kernels. Collaborate with research... 
    Suggested
    Full time

    Google Inc.

    Sunnyvale, CA
    3 days ago
  • $272k - $431.25k

     ...We are seeking a Principal AI and ML Infra Software Engineer, GPU Clusters at NVIDIA to join our Hardware Infrastructure team. As an Engineer, you...  ...long‑term roadmaps for such initiatives. Monitor and optimize the performance of our infrastructure ensuring high availability... 
    Suggested

    Jobleads-US

    Santa Clara, CA
    2 days ago
  •  ...establish best practices and optimize performance from the lowest-level GPU kernels to large-scale...  ...the C++/HIP/CUDA core of ML frameworks like PyTorch,...  ...stay at the forefront of LLM advancements, showing familiarity...  ...experience using GPU profiling and performance analysis... 
    Suggested

    Advanced Micro Devices , Inc.

    Santa Clara, CA
    3 days ago
  •  ...NVIDIA Gruppe is seeking a Principal AI and ML Infra Software Engineer to join our Hardware Infrastructure team in Santa Clara, CA. In this...  ...efficiency by addressing infrastructure deficiencies for GPU Clusters, fostering innovations in AI/ML research. The ideal... 
    Suggested

    Jobleads-US

    Santa Clara, CA
    2 days ago
  • $272k - $431.25k

    NVIDIA Corporation seeks a Principal AI and ML Infra Software Engineer in Santa Clara, California, to enhance the efficiency of AI/ML research on GPU Clusters. The role involves collaboration with various teams, monitoring infrastructure performance, and implementing improvements... 

    Jobleads-US

    Santa Clara, CA
    2 days ago
  • $168k - $258.75k

     ...coding agents synthesize, optimize, and deploy GPU kernels automatically. This job...  ..., you will act as the internal champion for AI agents and LLM-based coding workflows...  ...agents with compilers, profilers, execution sandboxes,...  ...platform products in AI, ML infrastructure, or high... 

    Nvidia Corporation

    Santa Clara, CA
    4 days ago
  • $184k - $287.5k

     ...libraries, code generators, and GPU kernel technologies for NVIDIA's...  ...attention kernel implementations, new LLM inference runtimes components,...  ...Designing, implementing, and optimizing kernels for high impact AI...  ...academic/ industry) experience with ML/DL systems development... 

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $272k - $425.5k

     ...Engineer – Large-Scale LLM Memory and Storage...  ...orchestrates GPU shards, routes requests...  ...the team in internal reviews and external...  ...performance storage, or ML systems infrastructure...  ...especially designs optimized for low latency and...  ...* Strong skills in profiling and optimizing... 
    Local area
    Remote work

    NVIDIA Corporation

    Santa Clara, CA
    22 hours ago
  • $184k - $287.5k

     ...Learning Software Engineer, LLM Performance page is...  ...enable the performance optimization, deployment and serving...  ...in developing GPU-accelerated Deep learning...  ...SGLang, Triton and CUDA kernels. Work and collaborate with...  ...performance modeling, profiling, debug, and code optimization... 

    NVIDIA

    Santa Clara, CA
    22 hours ago
  • $19 - $65 per hour

     ...s Scania, MAN, and International brands, Hyundai Motor...  ...performance custom kernels (using CUDA, Triton...  ...generation, kernel optimization, and automated performance profiling with Nsight and...  ...by both human and LLM-assisted workflows to maximize GPU utilization and reduce... 
    Internship
    Hourly pay

    PlusAI, Inc.

    Santa Clara, CA
    2 days ago
  • $184k - $287.5k

     ...state‑of‑the‑art model optimization techniques—speculative...  ...and efficient attention kernels optimized for KV‑caching...  ..., layer‑by‑layer model profiling to identify compute and...  ...with modern LLM/VLM inference stacks, such...  ...Strong understanding of GPU architecture, the compilation... 

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  •  ...connectivity. Key Responsibilities Design and optimize software for deploying large-scale AI models in production vehicles. Profile and improve inference performance across...  ...writing and optimizing CPU or CUDA kernels. Experience developing performance tooling... 
    Internship

    XPENG

    Santa Clara, CA
    1 day ago
  • $124k - $195.5k

    Deep Learning Kernel Software Performance Architect -...  ...computing. An era in which our GPU acts as the brains of...  ...issues* Engage AI/ML training and inference...  ...teams to identify and optimize critical deep learning...  ...performance analysis and profiling to identify performance... 
    Work experience placement

    NVIDIA

    Santa Clara, CA
    2 days ago
  •  ...is seeking a Senior Software Engineer to lead the optimization of distributed training across large-scale GPU platforms. Candidates should have substantial experience...  ...and technical leadership. This role involves profiling end-to-end workloads, debugging complex systems,... 

    NVIDIA Gruppe

    Santa Clara, CA
    22 hours ago
  • $159.05k - $199.3k

     ...software engineer with deep experience in optimizing ML models and deploying them on production‑...  ...solutions Set up methodologies to profile the model performance on target embedded...  ...years of experience with ML accelerators, GPU, CPU, SoC architecture and micro‑architecture... 
    Full time
    For contractors
    For subcontractor

    Decisive Point

    Sunnyvale, CA
    3 days ago
  •  ...for inference applications including deep learning framework optimizations and GPU kernel technologies. You will closely collaborate with other...  ...Computer Science, strong programming skills in C/C++, and significant experience with ML frameworks.

    NVIDIA

    Santa Clara, CA
    4 days ago
  •  ...Training | Optimisation | GPU | Hybrid, San Jose, CA...  ...: Productize and optimize models from Research into...  .../time-to-train using profiling and optimization. Implement...  .... Partner with ML Ops on CI/CD, telemetry...  ...Research, Platform/Infra, Data, and Product functions... 

    Enigma

    San Jose, CA
    1 day ago
  • $152k - $241.5k

    NVIDIA seeks an experienced engineer for AI-based GPU compiler technology in Santa Clara, California. The role involves designing technology...  ...ideal candidate holds an M.S. or Ph.D., has over 5 years in AI/ML, and skills in Python and C++. Competitive salaries range from $1... 

    NVIDIA

    Santa Clara, CA
    2 days ago
  • NVIDIA Gruppe is seeking a Senior Deep Learning Software Engineer focused on LLM performance in Santa Clara. You will optimize GPU-accelerated software for large language model deployment, working on performance tuning for various models. The ideal candidate has over 8... 

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $170.5k - $315.49k

     ...reasoning-intensive tasks.* Debug and optimize training runs — Profile training jobs, resolve bottlenecks, improve GPU utilization, and address numerical...  ...engineering, data science or ML research* Proficient in Python* Proficient in LLM architectures, optimization and model... 
    Internship
    Local area
    Shift work

    Intel

    Santa Clara, CA
    3 days ago
  • $152k - $241.5k

     ...innovative technologies in machine learning compilers and AI systems. The ideal candidate will have strong expertise in AI/ML, particularly in compiler optimization and low-level programming, along with 5+ years of relevant experience. This position offers a competitive salary... 

    2100 NVIDIA USA

    Santa Clara, CA
    3 days ago
  • $170.5k - $315.49k

    ## Inference Optimization Engineer (local / edge...  ...edge environments — GPU/iGPUs, Vulkan...  ...# What you’ll do* Profile and optimize local...  ...will develop:** The internals of modern inference...  ...* Understands how LLM inference works (attention...  ...) or SIMD / CPU kernels* Familiarity with... 
    Internship
    Local area
    Immediate start
    Shift work

    Intel

    Santa Clara, CA
    3 days ago
  • $248.71k - $292.6k

     ...Engineer - High Performance GPU Inference Systems...  .... Low‑Level GPU Optimization : Build deterministic...  ...Diagnostics : Develop profiling, observability, and diagnostics...  ...with teams across ML compilers,...  ...hierarchies, streams, kernels), OS internals, parallel algorithms,... 

    I did my part and supported the Regular Toilet

    Palo Alto, CA
    2 days ago
  • $50 - $175 per hour

    Title: AI Infrastructure / ML Infrastructure Engineer Job Type: Contract Contract...  ...involves provisioning, managing, and optimizing high-performance GPU clusters and infrastructure to...  ...CloudFormation. Building and maintaining the internal "Model Hub" for versioning and... 
    Remote job
    Contract work
    Immediate start

    DeWinter Group

    Campbell, CA
    1 day ago
  • $272k - $431.25k

     ...distributed training, GPU architecture, systems...  ...You will analyze and optimize frontier‑scale LLM workloads running on...  ...parallelism strategy, kernel efficiency, framework...  ...computing, ML frameworks, compilers...  ...track record of using profiling, tracing, benchmarking... 

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  •  ...Technical Staff — Kernel / Compiler / Communication...  ...understanding of GPU architecture and...  ...writing or optimizing high-performance kernels...  ...Strong debugging and profiling skills at system level...  ...to kernel/compiler/ML systems open source...  ..., the fastest open LLM serving engine),... 
    Flexible hours

    RadixArk

    Palo Alto, CA
    3 days ago
  • Sr. Product Manager - Runtime Infra, AI/ML, Annapurna Labs (Cupertino)...  ...training and inference at scale, optimal orchestration and efficient...  ...with Linux systems and kernel development Track record of...  ...as performance optimization, profiling and tooling Experience with... 

    Downtown Boulder Partnership

    Cupertino, CA
    2 days ago
  • $224k - $356.5k

     ...execution environment, low-level GPU optimizations and developing custom GPU kernels in CUDA and/or Triton....  ...solution. Analyze and profile GPU kernel-level...  ...software solutions (TRT, TRT-LLM, TRT Model Optimizer) can...  ...Python, PyTorch, and related ML tools. Strong algorithms... 

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $150k

     ...looking for a distributed ML infrastructure...  ...Implement distributed optimizers from mathematical specs...  ...across multi‑node, multi‑GPU clusters Own experiment...  ...and researchers. Infra Engineering - Write production...  ...with performance profiling, kernel fusion, or memory... 
    Flexible hours

    Institute of Foundation Models

    Sunnyvale, CA
    3 days ago
