Staff Software Engineer - GenAI Performance and Kernel

$190.9k - $232.8k

Cacheflow

About This Role

As a staff software engineer for GenAI Performance and Kernel, you will own the design, implementation, optimization, and correctness of the high-performance GPU kernels powering our GenAI inference stack. You will lead development of highly tuned, low-level compute paths, manage trade-offs between hardware efficiency and generality, and mentor others in kernel-level performance engineering. You will work closely with ML researchers, systems engineers, and product teams to push the state of the art in inference performance at scale.

What You Will Do

  • Lead the design, implementation, benchmarking, and maintenance of core compute kernels (e.g. attention, MLP, softmax, layernorm, memory management) optimized for various hardware backends (GPUs, accelerators)
  • Drive the performance roadmap for kernel-level improvements: vectorization, tensorization, tiling, fusion, mixed precision, sparsity, quantization, memory reuse, scheduling, auto-tuning, etc.
  • Integrate kernel optimizations with higher-level ML systems
  • Build and maintain profiling, instrumentation, and verification tooling to detect correctness issues, performance regressions, numerical issues, and hardware utilization gaps
  • Lead performance investigations and root-cause analysis on inference bottlenecks, e.g. memory bandwidth, cache contention, kernel launch overhead, tensor fragmentation
  • Establish coding patterns, abstractions, and frameworks to modularize kernels for reuse, cross-backend portability, and maintainability
  • Influence system architecture decisions to make kernel improvements more effective (e.g. memory layout, dataflow scheduling, kernel fusion boundaries)
  • Mentor and guide other engineers working on lower-level performance, provide code reviews, and help set best practices
  • Collaborate with infrastructure, tooling, and ML teams to roll out kernel-level optimizations into production, and monitor their impact

What We Look For

  • BS/MS/PhD in Computer Science or a related field
  • Deep hands-on experience writing and tuning compute kernels (CUDA, Triton, OpenCL, LLVM IR, assembly, or similar) for ML workloads
  • Strong knowledge of GPU/accelerator architecture: warp structure, memory hierarchy (global, shared, register, L1/L2 caches), tensor cores, scheduling, SM occupancy, etc.
  • Experience with advanced optimization techniques: tiling, blocking, software pipelining, vectorization, fusion, loop transformations, auto-tuning
  • Familiarity with ML-specific kernel libraries (cuBLAS, cuDNN, CUTLASS, oneDNN, etc.) or open kernels
  • Strong debugging and profiling skills (Nsight, nvprof, perf, VTune, custom instrumentation)
  • Experience reasoning about numerical stability, mixed precision, quantization, and error propagation
  • Experience integrating optimized kernels into real-world ML inference systems; exposure to distributed inference pipelines, memory management, and runtime systems
  • Experience building high-performance products leveraging GPU acceleration
  • Excellent communication and leadership skills: able to drive design discussions, mentor colleagues, and make trade-offs visible
  • A track record of shipping performance-critical, high-quality production software
  • Bonus: publications in systems/ML performance venues (e.g. MLSys, ASPLOS, ISCA, PPoPP), experience with custom accelerators or FPGAs, experience with sparsity or model compression techniques

Pay Range Transparency

Databricks is committed to fair and equitable compensation practices. The pay range(s) for this role is listed below and represents the expected salary range for non-commissionable roles or on-target earnings for commissionable roles. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to job-related skills, depth of experience, relevant certifications and training, and specific work location. Based on the factors above, Databricks anticipates utilizing the full width of the range. The total compensation package for this position may also include eligibility for annual performance bonus, equity, and the benefits listed above. For more information regarding which range your location is in visit our page here.

Local Pay Range

$190,900 — $232,800 USD

About Databricks

Databricks is the data and AI company. More than 10,000 organizations worldwide, including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500, rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook.

Benefits

At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region, please visit

Our Commitment to Diversity and Inclusion

At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.

Compliance

If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.

Vacancy posted 22 hours ago