
Remote | CUDA & GPU Kernel Optimization Engineer — $70-$90/hour

$70 - $90 per hour

24-MAG

New York, NY
  • Remote job

We are sharing a specialised part-time consulting opportunity for CUDA and GPU programming professionals experienced in kernel optimization, C++ engineering, profiler-guided performance analysis, GPU hardware utilization, and technical review.

This role supports current and upcoming remote consulting opportunities focused on GPU kernel optimization, performance evaluation, CUDA/HIP review, profiler metric analysis, C++ and Python workflows, and high-quality project execution. Selected professionals will apply their GPU programming expertise to analyze kernels, identify performance bottlenecks, improve implementation quality, and document optimization decisions across modern hardware environments.

Key Responsibilities

Professionals in this role may contribute to:

GPU Kernel Optimization

  • Analyze and optimize GPU kernels for performance, efficiency, and hardware utilization
  • Review kernel implementations and identify bottlenecks in memory access, occupancy, throughput, or execution patterns
  • Improve performance outcomes using CUDA, HIP, shader programming, or related GPU programming models
  • Optimize kernels even when limited background context is available for the underlying algorithm

Profiler-Guided Performance Analysis

  • Use profiler metrics such as L2 cache hit rate, L2 throughput, occupancy, memory behavior, and related performance signals
  • Evaluate when specific profiler metrics are useful, misleading, or secondary to other optimization factors
  • Document optimization decisions clearly and explain tradeoffs in technical terms
  • Calibrate performance judgments against structured benchmarks, hardware constraints, and project-specific criteria

C++, Python & GPU Programming Review

  • Write, modify, and reason about C++17, Python, and GPU programming code
  • Review code for correctness, performance impact, maintainability, and optimization potential
  • Use Git-based workflows to manage technical materials and project submissions
  • Apply practical GPU programming expertise across CUDA, HIP, Slang, HLSL, GLSL, or related kernel programming environments

Ideal Profile

Strong candidates may have:

  • Strong practical experience with GPU programming and kernel optimization
  • Fluency in core C++ features through C++17
  • Working knowledge of Python and Git
  • Fluency in at least one GPU programming model, such as CUDA, HIP, Slang, HLSL, GLSL, or related kernel programming
  • At least 1 year of professional or graduate-level research experience working with GPUs
  • Strong understanding of GPU profiler performance metrics and how to use them to optimize kernels
  • Ability to work independently on technical review and optimization tasks
  • Availability to work at least 20 hours per week depending on project scope

Educational Background

  • A degree in computer science, electrical engineering, computer engineering, applied mathematics, physics, mechanical engineering, or a related technical field is helpful
  • Graduate-level research, professional GPU engineering experience, or equivalent hands-on kernel optimization experience is highly relevant
  • Practical experience with CUDA, HIP, GPU architecture, high-performance computing, graphics programming, or compiler-adjacent performance work may be especially valuable

Nice to Have

  • Experience with CUDA, HIP, CUDA C++ Core Libraries, inline PTX assembly, or tensor core-level optimization
  • Experience optimizing kernels for NVIDIA Blackwell hardware or other modern GPU architectures
  • Familiarity with Nsight Compute or comparable GPU profiling tools
  • Prior experience with GPU hardware organizations such as NVIDIA, AMD, Qualcomm, or similar technical environments
  • Open-source contributions related to GPU kernel optimization, HPC, compiler tooling, graphics, or performance engineering

Why This Opportunity

  • Apply advanced GPU programming expertise to structured remote project work
  • Contribute to high-quality kernel optimization, performance review, and technical evaluation workflows
  • Work on flexible assignments aligned with CUDA, C++, profiler analysis, and GPU architecture strengths
  • Use your ability to identify bottlenecks, improve performance, and explain optimization decisions clearly
  • Remote structure with competitive hourly compensation

Contract Details

  • Independent contractor role
  • Fully remote with flexible scheduling
  • Eligible professionals may be based in approved project locations depending on project needs
  • Expected commitment of at least 20 hours per week depending on project availability
  • Competitive rates of $70 to $90 per hour, depending on expertise and project scope
  • Weekly payments via Stripe or Wise
  • Projects may be extended, shortened, or adjusted depending on scope and performance
  • Work will not involve access to confidential or proprietary information from any employer, client, or institution

About the Platform

This opportunity is available through 24-MAG LLC. We connect experienced professionals with remote consulting opportunities across technical, evaluation, and project-based workstreams.

By submitting this application, you acknowledge that your information may be processed by 24-MAG LLC for recruitment and opportunity matching in accordance with our Privacy Policy.

Vacancy posted 4 days ago
