ML Infra Intern: GPU Kernel Optimization & LLM Profiling

$19 - $65 per hour

A Medium Corporation

PlusAI is seeking a Machine Learning Infrastructure Engineer Intern to advance its AI-based virtual driver software. The role involves identifying bottlenecks in BEV model training and implementing high-performance custom kernels using CUDA or C++. Interns will also explore using LLMs for code generation. The position pays $19 to $65 per hour, depending on experience and education level. Join PlusAI for hands-on work in an innovative, dynamic field with opportunities for personal and professional growth.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the ML Infra Intern: GPU Kernel Optimization & LLM Profiling vacancy in Santa Clara, CA
  • $19 - $65 per hour

     ...a Machine Learning Infrastructure Engineer Intern to work on high-performance kernels for BEV model training. In this role, you will...  ...internship also explores the use of LLMs to optimize code generation and performance profiling. The hourly pay ranges from $19 to $65 based... 
    Internship
    Hourly pay

    PlusAI

    Santa Clara, CA
    12 hours ago
  • $207k - $300k

    Software Engineer, GDC LLM Serving and GPU Performance Google...  ...sequential decision making), ML infrastructure, or...  .... You could be optimizing KV cache transfer mechanisms...  ...down to performance profiling, ensuring Google’s...  ...GPU libraries and kernels. Collaborate with research... 
    Suggested
    Full time

    Google Inc.

    Sunnyvale, CA
    2 days ago
  •  ...Stack: Establish best practices and optimize performance from the lowest‑level GPU kernels to large‑scale distributed...  ...design. Deep experience using GPU profiling and performance analysis tools (e...  ...plus. Relevant publications in AI/ML, GPU computing, or system optimization... 
    Suggested

    AMD

    Santa Clara, CA
    1 day ago
  • $176k - $420k

     ...Expect The Performance Optimization team takes research...  ...development, kernel optimization, and hardware...  ..., optimize, and profile highly performant...  ...to distributed LLM inference Work with...  ...Understanding of computer and GPU architecture, SIMD,...  ..., hardware, and ML teams Degree in... 
    Suggested
    Hourly pay
    Full time
    Temporary work
    Flexible hours

    Tesla

    Palo Alto, CA
    2 days ago
  • $272k - $431.25k

     ...Principal AI and ML Infra Software Engineer, GPU Clusters We are seeking a Principal AI and ML Infra Software Engineer, GPU Clusters at NVIDIA...  ...long-term roadmaps for such initiatives. Monitor and optimize the performance of our infrastructure ensuring high availability... 
    Suggested

    NVIDIA

    Santa Clara, CA
    3 days ago
  •  ...Senior Staff AI Infra Engineer who is passionate...  ...focus on AI/ML workloads and GPU-accelerated...  ...and software to optimize performance for next...  ...Optimize and accelerate LLM training and...  ...GPUs, improving kernel, communication, and...  ...Experience with profiling and performance-... 

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    2 days ago
  • $168k - $258.75k

     ...coding agents synthesize, optimize, and deploy GPU kernels automatically. This job...  ..., you will act as the internal champion for AI agents and LLM-based coding workflows...  ...agents with compilers, profilers, execution sandboxes,...  ...platform products in AI, ML infrastructure, or high... 

    NVIDIA

    Santa Clara, CA
    1 day ago
  •  ...seniorities. As an Acceleration Kernel Developer at Tenstorrent, you will play a crucial role in optimizing low-level workloads, kernel...  ...and pipelines. Performance Profiling: Identify performance...  ...parallel algorithms on CPU, or GPU acceleration. High degree of... 
    Internship
    Permanent employment

    Tenstorrent

    Santa Clara, CA
    1 day ago
  • NVIDIA Gruppe is seeking a Principal AI and ML Infra Software Engineer to join our Hardware Infrastructure team in Santa Clara, CA. In this...  ...efficiency by addressing infrastructure deficiencies for GPU Clusters, fostering innovations in AI/ML research. The ideal candidate... 

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • $184k - $287.5k

     ...libraries, code generators, and GPU kernel technologies for NVIDIA's...  ...attention kernel implementations, new LLM inference runtimes components,...  ...Designing, implementing, and optimizing kernels for high impact AI...  ...academic/ industry) experience with ML/DL systems development... 
    Remote work

    NVIDIA

    Santa Clara, CA
    4 days ago
  • $272k - $431.25k

    NVIDIA Corporation seeks a Principal AI and ML Infra Software Engineer in Santa Clara, California, to enhance the efficiency of AI/ML research on GPU Clusters. The role involves collaboration with various teams, monitoring infrastructure performance, and implementing improvements... 

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  •  ...As a senior member of the LLM inference framework team, you...  ...responsible for building and optimizing production-grade single-node...  ...engines, distributed systems, and GPU runtime and kernel backends. THE...  ...You are a systems-minded ML engineer who thinks in terms... 

    Advanced Micro Devices, Inc.

    Santa Clara, CA
    4 days ago
  • $152k - $218.5k

     ...now looking for a Senior Kernel Performance Architect...  ...will be doing: Craft GPU-accelerated system...  ...Analyze, visualize, and optimize software performance using...  ...performance issues. AI/ML training and inference...  ...analysis and profiling to identify performance... 
    Work experience placement

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $19 - $65 per hour

     ...s Scania, MAN, and International brands, Hyundai Motor...  ...performance custom kernels (using CUDA, Triton...  ...generation, kernel optimization, and automated performance profiling with Nsight and...  ...by both human and LLM-assisted workflows to maximize GPU utilization and reduce... 
    Internship
    Hourly pay

    PlusAI

    Santa Clara, CA
    1 day ago
  • $184k - $287.5k

     ...Learning Software Engineer, LLM Performance page is...  ...enable the performance optimization, deployment and serving...  ...in developing GPU-accelerated Deep learning...  ...SGLang, Triton and CUDA kernels. Work and collaborate with...  ...performance modeling, profiling, debug, and code optimization... 

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  • $184k - $287.5k

     ...state‑of‑the‑art model optimization techniques—speculative...  ...and efficient attention kernels optimized for KV‑caching...  ..., layer‑by‑layer model profiling to identify compute and...  ...with modern LLM/VLM inference stacks, such...  ...Strong understanding of GPU architecture, the compilation... 

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • $184k - $287.5k

     ...Architect to assist customers in building AI/ML and HPC software solutions at scale. As a...  ...aspects related to tasks like large scale LLM training and inference. Conducting regular...  ...diagnostics. Hands-on experience with GPU systems in general including but not limited... 

    NVIDIA

    Santa Clara, CA
    3 days ago
  • $193.3k - $261.5k

     ...Trainium. The Acceleration Kernel Library team is at the...  ...for AWS's custom ML accelerators. Working at...  ...counts in delivering optimal performance for our customers...  ...analysis using profiling tools to identify and resolve...  ...- Experience with GPU kernel optimization and... 
    Internship
    Local area
    Work from home
    Flexible hours

    Amazon

    Cupertino, CA
    4 days ago
  • $152k - $287.5k

     ...accelerate the development of machine learning innovations. In this role, you'll design and implement solutions for GPU clusters, enabling researchers to optimize their work. Strong expertise in software engineering and languages like Python or C++ is required. The ideal... 

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • $40 - $70 per hour

    Location: Toronto; Employment Type: Intern; Location Type: Hybrid; Department: R&D - SW Kernels & Workloads; Compensation: CA$40 - CA...  ....g., LLMs, CNNs) Experience with ML frameworks such as PyTorch and ML...  ...understanding of computer architecture (CPU, GPU, custom ASICs, etc.) Currently... 
    Internship
    Hourly pay

    MixMode

    Santa Clara, CA
    1 day ago
  • $272k - $431.25k

     ...Dynamo orchestrates GPU shards, routes...  ...deployment of cutting-edge LLM workloads. We...  ...the team in internal reviews and external...  ...performance storage, or ML systems infrastructure...  ...especially designs optimized for low latency and...  ...~ Strong skills in profiling and optimizing... 
    Local area
    Remote work

    NVIDIA

    Santa Clara, CA
    5 days ago
  • $136k - $218.5k

     ...Senior Power Architecture & Optimization Engineer to push the limits of...  ...chip and unit‑level power using internal and industry‑standard RTL and...  ...models and flows, including ML/RL‑based techniques for...  ...architecture and help shape the energy profile of NVIDIA’s future products.... 

    NVIDIA

    Santa Clara, CA
    2 days ago
  •  ...effortlessly run large-scale ML applications, without...  ...10 times faster than GPU-based hyperscale cloud...  ...to join our on‑field Kernel Reliability team. You’ll...  ...inference, training, and internal production services. In...  ..., tracing, sanitizers, profilers, etc.). Familiarity... 
    Internship

    Dormont Manufacturing Co

    Sunnyvale, CA
    1 day ago
  • $124k - $195.5k

    Deep Learning Kernel Software Performance Architect -...  ...computing. An era in which our GPU acts as the brains of...  ...issues* Engage AI/ML training and inference...  ...teams to identify and optimize critical deep learning...  ...performance analysis and profiling to identify performance... 
    Work experience placement

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  •  ...Technical Staff, Machine Learning Kernels to design, optimize, and benchmark high-...  ...optimize high-performance ML kernels, primarily...  ...and memory efficiency. Profile, benchmark, and analyze performance...  ...accelerators. Advise internal teams on GPU and accelerator... 
    Visa sponsorship
    Relocation package

    Netpreme

    Santa Clara, CA
    4 days ago
  • $278.1k - $347.6k

     ...Mobile AI Inference Optimization Location Mountain...  ...the full mobile ML stack, and mentor a...  ...quality, and power profile of AI-driven features...  ...to hardware-specific kernel tuning on NPU, GPU, and CPU. Architecture...  ...information ~ International relocation support is... 
    Work at office
    Worldwide
    Relocation package

    Unity Technologies

    Mountain View, CA
    10 days ago
  • NVIDIA Gruppe is seeking a Senior Deep Learning Software Engineer focused on LLM performance in Santa Clara. You will optimize GPU-accelerated software for large language model deployment, working on performance tuning for various models. The ideal candidate has over 8... 

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • $224k - $356.5k

     ...execution environment, low-level GPU optimizations and developing custom GPU kernels in CUDA and/or Triton....  ...solution. Analyze and profile GPU kernel-level...  ...software solutions (TRT, TRT-LLM, TRT Model Optimizer) can...  ...Python, PyTorch, and related ML tools. ~ Strong... 

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $150k

     ...looking for a distributed ML infrastructure...  ...Implement distributed optimizers from mathematical specs...  ...across multi-node, multi-GPU clusters • Own...  ...researchers. • Infra Engineering - Write...  ...Familiarity with performance profiling, kernel fusion, or memory... 
    Flexible hours

    Institute of Foundation Models

    Sunnyvale, CA
    5 days ago
  •  ...work spans low-level kernel performance debugging and optimization, system-level...  ...the art and customer ML models. Optimize and...  ...level deep learning / LLM math. Strong analytical...  ...Computer Architecture, CPU/GPU Performance, Kernel...  ...to performance profiling and debug on any... 

    Dormont Manufacturing Co

    Sunnyvale, CA
    1 day ago
