RLEE - Low-Level Engineering & Kernel Inference Optimization

$90 - $125 per hour

Open Data Science

RL Environments, Kernel Optimization, GPU/CUDA, Compilers (LLVM/MLIR), PyTorch Extensions, Distributed Inference (vLLM/NCCL)

Brief Description of the Role

We're hiring Low-Level Engineers to design and build RL environments that teach LLMs kernel development, hardware optimization, and systems programming. The goal is to create realistic feedback loops where models learn to write high-performance code across GPU and CPU architectures. This is a remote contractor role; at least 4 hours of overlap with PST and advanced English (C1/C2) are required.

About the Company

Preference Model is building the next generation of training data to power the future of AI. Today's models are powerful but fail to reach their potential across diverse use cases because so many of the tasks we want to use them for are out of distribution. Preference Model creates RL environments where models encounter research and engineering problems, iterate, and learn from realistic feedback loops. Our founding team previously worked on Anthropic's data team, building the data infrastructure, tokenizers, and datasets behind the Claude model. We are partnering with leading AI labs to push AI closer to achieving its transformative potential. The company is backed by Tier 1 Silicon Valley VC.

Responsibilities

- Design and build MLE/SWE environments and diverse tasks.
- Target a specified language model and satisfy the required difficulty distribution.

Requirements

Minimum Qualifications

- Strong Python (engineering-quality, not notebook-only)
- Clear understanding of LLMs and their current limitations
- Ability to meet throughput expectations and respond quickly to feedback

You may be a good fit if one of the following applies

- Deep understanding of memory hierarchies (registers, L1/L2/shared memory, HBM, system RAM) and their performance implications
- Threading models, synchronization primitives, and concurrent programming (warps, thread blocks, barriers, atomics)
- Cache coherence, memory access patterns, coalescing, and bank conflicts
- AOT compilation and optimization passes (LLVM, MLIR, TVM)
- Compiler and kernel frameworks such as CUTLASS, BitBLAS, or JAX/Pallas
- Modern C++, including templates, concurrency, and build systems
- Assembly-level programming and low-level optimization across GPU and CPU architectures (e.g., x86, ARM, NVIDIA Hopper, NVIDIA Blackwell)
- Debugging and optimizing GPU kernels using CUDA and/or HIP/ROCm
- Developing PyTorch custom operators, backend extensions, or dispatcher integrations (e.g., ATen, TorchScript, or custom backends)
- Customizing, extending, or optimizing vLLM, including distributed inference workflows
- GPU communication libraries and collectives, such as NVIDIA NCCL, AMD RCCL, MPI, or UCX
- Mixed-precision and low-precision kernels (e.g., FP16, BF16, FP8, INT8), including numerical stability and performance trade-offs

Working conditions

Hourly contractor rate: 90-125 USD/hour (dependent on expertise level and the quality of the take-home assignment).


Vacancy posted 1 day ago

