Lead Kubernetes & GitOps Engineer for GPU Inference

Jack & Jill/External ATS

A cutting-edge AI infrastructure startup is seeking a Kubernetes DevOps Engineer to join their innovative team in San Francisco. The role involves building and maintaining production-grade Kubernetes clusters across various environments, focusing on high-performance GPU workloads. Ideal candidates will have deep Kubernetes expertise and a strong foundation in GitOps methodologies. This position offers the chance to influence core architecture in a high-impact startup with significant venture backing.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the Lead Kubernetes & GitOps Engineer for GPU Inference in San Francisco, CA vacancy
  • FriendliAI is seeking a GPU Kernel Engineer in San Francisco to design and optimize GPU kernels for AI inference. This role requires expertise in CUDA, C++, and performance-critical systems. You will work on cutting-edge GPU technology and contribute to a highly collaborative... 
    Suggested

    FriendliAI

    San Francisco, CA
    4 days ago
  • $220k - $320k

    inference.net, a growing company in San Francisco, seeks an experienced engineer to optimize AI inference performance. The ideal candidate will have over 2 years of experience in ML systems and GPU programming. Key responsibilities include implementing optimization techniques... 
    Suggested

    inference.net

    San Francisco, CA
    4 days ago
  • $180k - $250k

    A leading technology company in San Francisco is seeking a skilled engineer to build custom compute environments, enhancing GPU performance for customer workloads. Candidates should have deep...  ...fundamentals, along with experience in Kubernetes cluster management. The role... 
    Suggested
    Relocation package

    Fal

    San Francisco, CA
    4 days ago
  • Genesis AI is seeking an experienced individual to develop low-latency inference pipelines for on-device deployment in robotics. The role involves designing and optimizing distributed systems on GPU clusters, implementing efficient low-level code such as CUDA and Triton... 
    Suggested

    Genesis AI

    San Francisco, CA
    4 days ago
  • $160k - $320k

    A leading AI computing firm is seeking a Systems Engineer in San Francisco or Los Angeles to scale AI inference. Candidates should have strong C++ skills, HPC experience, and knowledge of...  ...techniques. Responsibilities include designing GPU kernels, optimizing performance, and... 
    Suggested

    Vast.ai

    San Francisco, CA
    4 days ago
  • $160k - $230k

     ...LLM Inference Frameworks and Optimization Engineer San Francisco, Singapore, Amsterdam About...  ...-throughput inference, GPU/accelerator optimizations...  ...orchestration frameworks, such as Kubernetes (K8S) Contributions to...  .... We have contributed to leading open-source research,... 
    Full time

    Together AI

    San Francisco, CA
    1 day ago
  •  ...Job Description Machine Learning Engineer, Inference Want to solve realtime inference problems...  ...streaming inference, scheduler design, GPU utilisation, concurrency optimisation,...  ..., C++, Python, CUDA, TensorRT, Triton, Kubernetes, AWS, and custom realtime inference... 
    Remote work
    Flexible hours

    techire ai

    San Francisco, CA
    4 days ago
  • $220k - $320k

    A tech startup specializing in AI inference seeks a skilled professional to optimize their inference stack. Candidates should have over 2 years of experience in ML systems, fluency in Python, and hands-on experience with LLM frameworks. The role offers competitive compensation... 
    Local area

    Inference

    San Francisco, CA
    3 days ago
  • A leading AI acceleration company in San Francisco is seeking a GPU Kernel Engineer to optimize performance for machine learning models. You will be responsible for designing high-performance GPU kernels and using advanced techniques to boost computation efficiency. Ideal... 

    Baseten

    San Francisco, CA
    4 days ago
  • An innovative company is seeking a talented software engineer to join their dynamic Inference team. This role involves designing and implementing infrastructure for large-scale multimodal models, focusing on high-performance delivery of audio and image inputs. You'll collaborate... 

    OpenAI

    San Francisco, CA
    3 days ago
  •  ...for building our Forward Deployed Engineering function. This team plays a...  ...contributing to our internal codebase for inference, fine‑tuning, and evaluation. Lead end‑to‑end deployment across...  ...modern DevOps practices (Docker, Kubernetes, and CI/CD). Deep understanding... 
    Full time
    Relocation package

    B Capital

    San Francisco, CA
    12 hours ago
  • $200k - $280k

    A leading AI company in San Francisco is looking for a Staff Machine Learning Engineer to enhance inference systems at production scale. You will design algorithms, optimize performance, and collaborate on RL and post-training pipelines. Ideal candidates have 3+ years of... 
    Full time

    AI Chopping Block, Inc.

    San Francisco, CA
    1 day ago
  •  ...GPU Kernel Engineer Sciforium is an AI infrastructure company developing next-generation multimodal AI models and a proprietary, high-efficiency...  ...high-level ML frameworks used for large-scale training and inference. This role is ideal for someone who thrives at the... 
    Flexible hours

    Sciforium

    San Francisco, CA
    1 day ago
  •  ...Member of Technical Staff focused on building and optimizing ML inference systems in San Francisco. The role involves designing end-to-...  ...real-world workloads. Candidates should have strong software engineering skills, experience with ML inference systems, and proficiency... 

    Acceler8 Talent

    San Francisco, CA
    3 days ago
  • $300k

    GPU Optimisation Engineer — Real-Time Inference Want to push GPU performance to its limits — not in theory, but in production systems handling real-time speech and multimodal workloads? This team is building low-latency AI systems where milliseconds actually matter. The... 
    Relocation
    Visa sponsorship
    Free visa

    Techire Ai

    San Francisco, CA
    1 day ago
  • $100k - $120k

     ...foundation models. As training and inference workloads grow, we need kernel‑level...  ...and faster. Responsibilities Lead a team of kernel and system engineers focused on performance-critical code...  ...compute kernels for CPU (AVX/ARM NEON), GPU (CUDA/ROCm), and hardware... 

    Coda Robotics

    San Francisco, CA
    2 days ago
  • $160k - $230k

    Together AI is seeking an Inference Frameworks and Optimization Engineer in San Francisco, California. The role focuses on designing and optimizing distributed...  ...in deep learning inference frameworks, proficiency in GPU programming, and strong collaboration skills.... 

    Together AI

    San Francisco, CA
    3 days ago
  • ABOUT BASETEN Baseten powers mission‑critical inference for the world's most dynamic AI companies, like...  ...Conviction. Join us and help build the platform engineers turn to to ship AI products. THE ROLE We’re seeking a GPU Kernel Engineer to join our team at the cutting... 
    Flexible hours

    Baseten

    San Francisco, CA
    1 day ago
  • $180k - $270k

     ...individuals for AI infrastructure roles in San Francisco, focusing on building high-performance inference engines for speech AI. Ideal candidates will have substantial experience in GPU architecture and real-time systems. This position offers a competitive salary range of $18... 

    Plaud

    San Francisco, CA
    2 days ago
  •  ...revenue, raised an $80M Series A, and is scaling a world-class engineering team across inference, distributed systems, compiler infrastructure, and high-...  ...‑adjacent systems experience Experience optimizing GPU utilization at scale Background in AI infrastructure or high... 

    Acceler8 Talent

    San Francisco, CA
    2 days ago
  •  ...team of YC and unicorn founders and senior engineers with deep expertise in 3D, generative...  ...We're looking for a Founding Engineer, ML Inference with deep expertise in high-performance ML...  ...serving architectures Working knowledge of GPU hardware (NVIDIA) and the ability to dive... 
    Relocation
    Visa sponsorship
    Relocation package

    Reactor

    San Francisco, CA
    12 hours ago
  • A leading AI infrastructure company in San Francisco is seeking an Infrastructure Ops Engineer to manage the operational health of their GPU fleet. This role involves working closely with customer success...  ...possess strong skills in Kubernetes and a solid background in cloud... 

    Baseten

    San Francisco, CA
    2 days ago
  •  ...Baseten powers mission‑critical inference for the world's most dynamic...  ...and help build the platform engineers turn to to ship AI products....  ...foundational engineers to lead our GPU Networking efforts, making...  ...upstream tools (like standard Kubernetes networking) are too slow for... 
    Flexible hours

    Baseten

    San Francisco, CA
    4 days ago
  • $160k - $320k

     ...initiative and deliver excellence. We seek engineers/researchers with strong intrinsic drive, a...  ...parallel programming experience to help scale AI inference. You'll leverage your knowledge of high-performance systems to optimize GPU performance at the bleeding edge of AI.... 
    Full time
    Work at office

    Vast.ai

    San Francisco, CA
    4 days ago
  • $280k

    Anthropic is looking for a TPU Kernel Engineer in San Francisco, California. In this role, you will identify and resolve performance...  ...issues across ML systems, particularly in research, training, and inference. You will design and optimize TPU kernels and provide critical... 

    Anthropic

    San Francisco, CA
    2 days ago
  • $160k - $200k

     ...Infrastructure Operations Engineer Lightning AI is the...  ..., and production inference, with security, observability...  ...operational scale for GPU infrastructure. This...  ...years experience with Kubernetes and strong container...  ...Familiarity with the gitops workflow. ~ Software... 
    Remote work
    Work from home
    Flexible hours

    Lightning AI

    San Francisco, CA
    2 days ago
  •  ...into useful intelligence - the inference services that serve LLMs at...  ...about both. Researchers and ML engineers will hand you workloads that...  ...cost across heterogeneous GPU fleets. Batching, scheduling,...  ...by. ~ Experience operating Kubernetes-based infrastructure, including... 
    Flexible hours

    Adaption

    San Francisco, CA
    3 days ago
  •  ...Kubernetes Expert Looking for vanilla Kubernetes experts to help us define and implement a Kube ecosystem including where AKS ends and where InfTech begins. Minimum years of experience 5+. Top responsibilities you would expect the subcon to shoulder and execute Kubernetes... 

    ClifyX

    San Francisco, CA
    1 day ago
  •  ...superintelligence. One person, one GPU. If you'd like to...  ...Tuesday. Product Engineering at Lambda is...  ...supporting AI training and inference at scale. Lambda's...  ...Partnerships: Lead technical communications...  ...Experience with Kubernetes, including CSI and COSI... 
    Work at office
    Local area
    Work from home
    Flexible hours

    Lambda

    San Francisco, CA
    1 day ago
  •  ...expertise in model innovation and systems engineering paired with a design-minded product...  ...models and experiences. We're funded by leading investors at Index Ventures and Lightspeed...  ...About the Role We're hiring an Inference Engineer to advance our mission of building... 
    Work at office
    Visa sponsorship
    Flexible hours

    Cartesia, Inc.

    San Francisco, CA
    4 days ago
