CUDA Kernel Engineer (Remote US)
Pragmatike
- Remote job
About The Role
Pragmatike is hiring on behalf of a fast‑growing AI startup recognized as a Top 10 GenAI company by GTM Capital, founded by MIT CSAIL researchers.

Location: Remote US
Start date: ASAP
Languages: English (required)

We are searching for a CUDA Kernel Engineer who has hands‑on experience developing and optimizing NVIDIA CUDA kernels from scratch. You will work on the GPU performance layer powering large-scale, high-throughput AI systems used by Fortune 500 customers.

This role is ideal for someone who deeply understands NVIDIA GPU architecture, memory hierarchy, warp‑level execution, and profiling workflows, not someone coming from generic hardware, FPGA, or non‑NVIDIA compute backgrounds. You will directly influence the GPU efficiency, throughput, and scalability of mission‑critical AI systems.

What You’ll Do
- Design, implement, and optimize custom CUDA kernels for NVIDIA GPUs, with a focus on maximizing occupancy, memory throughput, and warp efficiency.
- Profile GPU workloads using tools such as Nsight Compute, Nsight Systems, nvprof, and CUDA‑MEMCHECK.
- Analyze and eliminate performance bottlenecks including warp divergence, uncoalesced memory access, register pressure, and PCIe transfer overhead.
- Improve GPU memory pipelines (global, shared, L2, texture memory) and ensure proper memory coalescing.
- Collaborate closely with AI systems, model acceleration, and backend distributed systems teams.
- Contribute to GPU architecture decisions, kernel libraries, and internal performance‑engineering best practices.

What We’re Looking For
- Proven track record building NVIDIA CUDA kernels from scratch, not just calling existing libraries.
- Strong ability to optimize kernels (tiling strategies, occupancy tuning, shared memory design, warp scheduling).
- Deep understanding of CUDA threads, warps, blocks, and grids, GPU memory hierarchy and memory coalescing, as well as warp divergence (how to detect, analyze, and mitigate it).
- Experience diagnosing PCIe bottlenecks and optimizing host‑device transfers (pinned memory, streams, batching, overlap).
- Familiarity with C++, CUDA runtime APIs, and GPU debugging/profiling tooling.

Bonus Points
- Experience with multi‑GPU or distributed GPU systems (NCCL, NVLink, MIG).
- Background in GPU acceleration for ML frameworks or HPC workloads.
- Knowledge of model inference optimization (TensorRT, CUDA Graphs, CUTLASS).
- Exposure to compiler‑level optimization or PTX/SASS analysis.
- Startup experience or comfort working in fast‑moving, ambiguous environments.

Benefits
- Competitive salary & equity options
- Sign‑on bonus
- Health, Dental, and Vision
- 401k

EEO Statement
Pragmatike is an Equal Opportunity Employer and is committed to providing equal employment opportunities to all applicants without discrimination. We recruit on behalf of our clients and prohibit discrimination and harassment based on race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.

We are committed to a fair and inclusive hiring process. We process your personal data solely for recruitment purposes, in accordance with applicable privacy laws, and maintain reasonable safeguards to protect your information. Your data may be shared with our client(s) for hiring consideration, but will not be disclosed to third parties outside of the recruitment process.

