Remote | CUDA & GPU Kernel Optimization Engineer — $70-$90/hour
$70 - $90 per hour
24-MAG
- Remote job
We are sharing a specialised part-time consulting opportunity for CUDA and GPU programming professionals experienced in kernel optimization, C++ engineering, profiler-guided performance analysis, GPU hardware utilization, and technical review.
This role supports current and upcoming remote consulting opportunities focused on GPU kernel optimization, performance evaluation, CUDA/HIP review, profiler metric analysis, C++ and Python workflows, and high-quality project execution. Selected professionals will apply their GPU programming expertise to analyze kernels, identify performance bottlenecks, improve implementation quality, and document optimization decisions across modern hardware environments.
Key Responsibilities
Professionals in this role may contribute to:
GPU Kernel Optimization
- Analyze and optimize GPU kernels for performance, efficiency, and hardware utilization
- Review kernel implementations and identify bottlenecks in memory access, occupancy, throughput, or execution patterns
- Improve performance outcomes using CUDA, HIP, shader programming, or related GPU programming models
- Optimize kernels even when limited background context is available for the underlying algorithm
Profiler-Guided Performance Analysis
- Use profiler metrics such as L2 cache hit rate, L2 throughput, occupancy, memory behavior, and related performance signals
- Evaluate when specific profiler metrics are useful, misleading, or secondary to other optimization factors
- Document optimization decisions clearly and explain tradeoffs in technical terms
- Calibrate performance judgments against structured benchmarks, hardware constraints, and project-specific criteria
C++, Python & GPU Programming Review
- Write, modify, and reason about C++17, Python, and GPU programming code
- Review code for correctness, performance impact, maintainability, and optimization potential
- Use Git-based workflows to manage technical materials and project submissions
- Apply practical GPU programming expertise across CUDA, HIP, Slang, HLSL, GLSL, or related kernel programming environments
Ideal Profile
Strong candidates may have:
- Strong practical experience with GPU programming and kernel optimization
- Fluency in core C++ features through C++17
- Working knowledge of Python and Git
- Fluency in at least one GPU programming model, such as CUDA, HIP, Slang, HLSL, GLSL, or related kernel programming
- At least 1 year of professional or graduate-level research experience working with GPUs
- Strong understanding of GPU profiler performance metrics and how to use them to optimize kernels
- Ability to work independently on technical review and optimization tasks
- Availability to work at least 20 hours per week depending on project scope
Educational Background
- A degree in computer science, electrical engineering, computer engineering, applied mathematics, physics, mechanical engineering, or a related technical field is helpful
- Graduate-level research, professional GPU engineering experience, or equivalent hands-on kernel optimization experience is highly relevant
- Practical experience with CUDA, HIP, GPU architecture, high-performance computing, graphics programming, or compiler-adjacent performance work may be especially valuable
Nice to Have
- Experience with CUDA, HIP, CUDA C++ Core Libraries, inline PTX assembly, or tensor core-level optimization
- Experience optimizing kernels for NVIDIA Blackwell hardware or other modern GPU architectures
- Familiarity with Nsight Compute or comparable GPU profiling tools
- Prior experience at GPU hardware companies such as NVIDIA, AMD, or Qualcomm, or in similar technical environments
- Open-source contributions related to GPU kernel optimization, HPC, compiler tooling, graphics, or performance engineering
Why This Opportunity
- Apply advanced GPU programming expertise to structured remote project work
- Contribute to high-quality kernel optimization, performance review, and technical evaluation workflows
- Work on flexible assignments aligned with CUDA, C++, profiler analysis, and GPU architecture strengths
- Use your ability to identify bottlenecks, improve performance, and explain optimization decisions clearly
- Remote structure with competitive hourly compensation
Contract Details
- Independent contractor role
- Fully remote with flexible scheduling
- Eligible professionals may be based in approved project locations depending on project needs
- Expected commitment of at least 20 hours per week depending on project availability
- Competitive rates between $70–$90 per hour depending on expertise and project scope
- Weekly payments via Stripe or Wise
- Projects may be extended, shortened, or adjusted depending on scope and performance
- Work will not involve access to confidential or proprietary information from any employer, client, or institution
About the Platform
This opportunity is available through 24-MAG LLC. We connect experienced professionals with remote consulting opportunities across technical, evaluation, and project-based workstreams.
By submitting this application, you acknowledge that your information may be processed by 24-MAG LLC for recruitment and opportunity matching in accordance with our Privacy Policy.

