Member of Technical Staff - Kernels & GPU Performance
Gimlet Labs
About Us
Gimlet Labs is building the first heterogeneous neocloud for AI workloads. As AI systems scale, the industry is hitting fundamental limits in power, capacity, and cost with today's homogeneous, vertically integrated infrastructure. Gimlet addresses this by decoupling AI workloads from the underlying hardware. Our platform intelligently partitions workloads into components and orchestrates each component to hardware that best fits its performance and efficiency needs. This approach enables heterogeneous systems across multi-vendor and multi-generation hardware, including the latest emerging accelerators. These systems unlock step-function improvements in performance and cost efficiency at scale. On top of this foundation, Gimlet is building a production-grade neocloud for agentic workloads. Customers use Gimlet to deploy and manage their workloads through stable, production-ready APIs, without having to reason about hardware selection, placement, or low-level performance optimization. Gimlet works with foundation labs, hyperscalers, and AI-native companies to power real production workloads built to scale to gigawatt-class AI datacenters.
Mission
Gimlet Labs is seeking a Member of Technical Staff focused on kernels and GPU performance. In this role, you will work close to accelerators and execution hardware to extract maximum performance from AI workloads across diverse and rapidly evolving platforms. You will analyze low-level execution behavior, design and optimize kernels, and ensure performance is reliable across both established and emerging hardware. This role is ideal for engineers who enjoy deep performance work, reasoning about hardware tradeoffs, and turning theoretical peak performance into real-world results.
Responsibilities
- Design, implement, and optimize GPU and accelerator kernels for AI workloads
- Analyze and tune performance across the GPU execution stack, including memory access patterns, synchronization, and instruction scheduling
- Work with compilers and runtimes to ensure kernels integrate cleanly and perform well in end-to-end systems
- Bring up and optimize execution on new or emerging accelerators
- Profile, benchmark, and debug performance issues across kernels, runtimes, and hardware
- Ensure performance optimizations are robust, correct, and production-ready at scale
Qualifications
- Strong software engineering fundamentals
- Experience working on performance-critical systems close to hardware
- Comfort reasoning about low-level execution behavior, memory hierarchies, and performance tradeoffs
- Experience with CUDA, Triton, CUTLASS, or other accelerator programming models
- Deep understanding of GPU execution models (warps/wavefronts, blocks, grids)
- Experience optimizing memory access patterns (coalescing, shared memory, cache behavior)
- Familiarity with occupancy, latency hiding, and instruction-level parallelism
- Experience using profiling and performance analysis tools
- Familiarity with multi-GPU or distributed execution is a plus
Vacancy posted 2 days ago

