Lead Kubernetes & GitOps Engineer for GPU Inference

Jack & Jill/External ATS

A cutting-edge AI infrastructure startup is seeking a Kubernetes DevOps Engineer to join their innovative team in San Francisco. The role involves building and maintaining production-grade Kubernetes clusters across various environments, focusing on high-performance GPU workloads. Ideal candidates will have deep Kubernetes expertise and a strong foundation in GitOps methodologies. This position offers the chance to influence core architecture in a high-impact startup with significant venture backing.

Vacancy posted 16 hours ago

Similar jobs that could be interesting for you
Based on the Lead Kubernetes & GitOps Engineer for GPU Inference in San Francisco, CA vacancy

GPU Kernel Engineer for AI Inference & Performance
FriendliAI is seeking a GPU Kernel Engineer in San Francisco to design and optimize GPU kernels for AI inference. This role requires expertise in CUDA, C++, and performance-critical systems. You will work on cutting-edge GPU technology and contribute to a highly collaborative...
Suggested
FriendliAI
San Francisco, CA
2 days ago
Senior Inference Performance Engineer - GPU & CUDA
$220k - $320k
inference.net, a growing company in San Francisco, seeks an experienced engineer to optimize AI inference performance. The ideal candidate will have over 2 years of experience in ML systems and GPU programming. Key responsibilities include implementing optimization techniques...
Suggested
inference.net
San Francisco, CA
2 days ago
Real-Time GPU Inference Optimization Engineer
$300k
A leading technology firm in San Francisco seeks a GPU Optimisation Engineer to maximize GPU performance in real-time AI systems. The ideal candidate will possess strong... ...of GPU execution, and a knack for optimizing inference latency for large generative models. With a...
Suggested
Visa sponsorship
Relocation package
Trades Workforce Solutions
San Francisco, CA
16 hours ago
Senior GPU-Driven Virtualization & Kubernetes Engineer
$180k - $250k
A leading technology company in San Francisco is seeking a skilled engineer to build custom compute environments, enhancing GPU performance for customer workloads. Candidates should have deep... ...fundamentals, along with experience in Kubernetes cluster management. The role...
Suggested
Relocation package
Fal
San Francisco, CA
2 days ago
Robotics GPU Inference Engineer — Hybrid (Relocation)
OpenAI is seeking a GPU Inference Engineer based in San Francisco, CA. In this high-impact role, you'll optimize inference performance and scalability for Robotics research, driving engineering efforts to enhance model serving and system efficiency. The ideal candidate...
Suggested
Work at office
Relocation
Relocation package
OpenAI
San Francisco, CA
16 hours ago
LLM Inference Frameworks and Optimization Engineer
$160k - $230k
...LLM Inference Frameworks and Optimization Engineer San Francisco, Singapore, Amsterdam About... ...-throughput inference, GPU/accelerator optimizations... ...orchestration frameworks, such as Kubernetes (K8S) Contributions to... .... We have contributed to leading open-source research,...
Full time
Together AI
San Francisco, CA
24 days ago
Inference Engineer
...Job Description Machine Learning Engineer, Inference Want to solve realtime inference problems... ...streaming inference, scheduler design, GPU utilisation, concurrency optimisation,... ..., C++, Python, CUDA, TensorRT, Triton, Kubernetes, AWS, and custom realtime inference...
Remote work
Flexible hours
techire ai
San Francisco, CA
16 hours ago
GPU Kernel Engineer: Build Fast AI Inference at Scale
A leading AI acceleration company in San Francisco is seeking a GPU Kernel Engineer to optimize performance for machine learning models. You will be responsible for designing high-performance GPU kernels and using advanced techniques to boost computation efficiency. Ideal...
Baseten
San Francisco, CA
2 days ago
Senior Inference Performance Engineer — GPU & CUDA
$220k - $320k
A tech startup specializing in AI inference seeks a skilled professional to optimize their inference stack. Candidates should have over 2 years of experience in ML systems, fluency in Python, and hands-on experience with LLM frameworks. The role offers competitive compensation...
Local area
Inference
San Francisco, CA
1 day ago
Forward Deployed Engineer, Lead - LLM Post-training
...and constraints. As a Forward Deployed Engineer Lead, Post-Training, you will own the end-to-... ...working with infrastructure teams to ensure inference performance, cost efficiency, and... ...scale: comfortable working with multi-node GPU clusters, managing large training runs,...
Relocation package
Reflection
San Francisco, CA
2 days ago
HPC/GPU Hardware Engineer
...history. When people finance GPU clusters, the datacenters... ...year contracts for computer and inference, but sell to customers on monthly... ...shape culture, mentor junior engineers, and learn from our customers... ...systems such as Slurm and Kubernetes Exposure to virtualization...
Long term contract
Contract work
Fixed term contract
Work at office
Local area
Visa sponsorship
Shift work
3 days per week
The San Francisco Compute Company
San Francisco, CA
16 hours ago
Staff ML Inference Systems Engineer - Scalable GPU Infra (SF)
...Member of Technical Staff focused on building and optimizing ML inference systems in San Francisco. The role involves designing end-to-... ...real-world workloads. Candidates should have strong software engineering skills, experience with ML inference systems, and proficiency...
Acceler8 Talent
San Francisco, CA
1 day ago
GPU Kernel Engineer
...Sciforium Gpu Kernel Engineer Sciforium is an AI infrastructure company developing next-generation multimodal AI models and a proprietary... ...high-level ML frameworks used for large-scale training and inference. This role is ideal for someone who thrives at the intersection...
Flexible hours
Sciforium
San Francisco, CA
4 days ago
AI Inference BD Lead - Startup GTM (Bay Area)
$160k - $220k
...Business Development Manager, AI Inference (Startup GTM) Position Type:... ...infrastructure, and direct engineering support for teams shipping to... ...Do Source and qualify BD leads across the Bay Area AI startup... ...selling AI infrastructure, MaaS, GPU compute, or LLM API products...
Work experience placement
IntelliPro
San Francisco, CA
16 hours ago
Performance Engineer, GPU
$280k
...growing group of committed researchers, engineers, policy experts, and business leaders working... ...AI requires breakthrough innovations in GPU performance and systems engineering. As a... ...capabilities and dramatically improve inference efficiency. Working at the intersection...
Work at office
Visa sponsorship
Flexible hours
Anthropic
San Francisco, CA
1 day ago
GPU Kernel Engineer
$100k - $120k
...foundation models. As training and inference workloads grow, we need kernel‑level... ...and faster. Responsibilities Lead a team of kernel and system engineers focused on performance-critical code... ...compute kernels for CPU (AVX/ARM NEON), GPU (CUDA/ROCm), and hardware...
Coda Robotics
San Francisco, CA
16 hours ago
LLM Inference Engineer: Frameworks & Optimizations
$160k - $230k
Together AI is seeking an Inference Frameworks and Optimization Engineer in San Francisco, California. The role focuses on designing and optimizing distributed... ...in deep learning inference frameworks, proficiency in GPU programming, and strong collaboration skills....
Together AI
San Francisco, CA
1 day ago
Kubernetes Infra Ops Engineer — AI Fleet & Capacity
A leading AI infrastructure company in San Francisco is seeking an Infrastructure Ops Engineer to manage the operational health of their GPU fleet. This role involves working closely with customer success... ...possess strong skills in Kubernetes and a solid background in cloud...
Baseten
San Francisco, CA
16 hours ago
Senior Engineer 2: AI Inference Engine Systems
$167.2k - $209k
...applications. We are seeking a Senior Engineer 2 to join our AI Inference Data Plane team. In this role, you... ...and scale their models with industry-leading performance and reliability. This is... ...Hardware & Interconnects: Understanding of GPU‑level optimisation and experience...
Local area
Remote work
Worldwide
Flexible hours
DigitalOcean
San Francisco, CA
6 days ago
INFERENCE ENGINEER
...ABOUT THE ROLE You build and operate the inference systems that serve our models in... ...with running real workloads. This is an engineering role, not a research role. You'll measure... ...measured before you change Experience with GPU‑accelerated inference at scale (multi‑GPU...
MakerMaker.AI
San Francisco, CA
4 days ago
GPU Optimization Engineer
$300k
GPU Optimisation Engineer — Real-Time Inference Want to push GPU performance to its limits — not in theory, but in production systems handling real-time speech and multimodal workloads? This team is building low-latency AI systems where milliseconds actually matter. The...
Relocation
Visa sponsorship
Free visa
Techire Ai
San Francisco, CA
4 days ago
Kernel Engineer - GPU
ABOUT BASETEN Baseten powers mission‑critical inference for the world's most dynamic AI companies, like... ...Conviction. Join us and help build the platform engineers turn to to ship AI products. THE ROLE We’re seeking a GPU Kernel Engineer to join our team at the cutting...
Flexible hours
Baseten
San Francisco, CA
4 days ago
Founding Engineer, ML Inference
...team of YC and unicorn founders and senior engineers with deep expertise in 3D, generative... ...We're looking for a Founding Engineer, ML Inference with deep expertise in high-performance ML... ...serving architectures Working knowledge of GPU hardware (NVIDIA) and the ability to dive...
Relocation
Visa sponsorship
Relocation package
Reactor
San Francisco, CA
3 days ago
Senior Engineer 2: GPU Kernel and Performance
$167.2k - $209k
...DigitalOcean is seeking a Senior Engineer 2 to play a key technical role in our AI Inference Optimization team. DigitalOcean aims... ...ensure we can offer the industry-leading performance for our inference... ...optimizations at the inference engine and GPU kernel layers, ensuring our...
Local area
Remote work
Worldwide
Flexible hours
DigitalOcean
San Francisco, CA
16 hours ago
Software Engineer — GPU Networking & Distributed Systems
...Baseten powers mission‑critical inference for the world's most dynamic... ...and help build the platform engineers turn to to ship AI products.... ...foundational engineers to lead our GPU Networking efforts, making... ...upstream tools (like standard Kubernetes networking) are too slow for...
Flexible hours
Baseten
San Francisco, CA
2 days ago
Performance Engineer, Inference Systems
$350k
...group of committed researchers, engineers, policy experts, and business... ...the Role Anthropic's inference fleet serves Claude to millions... ...and serving configurations, and lead the investigation when it catches... ...plus Familiarity with GPU/TPU/accelerator performance concepts...
Work at office
Visa sponsorship
Flexible hours
Anthropic
San Francisco, CA
16 hours ago
TPU Kernel Engineer — Lead Low-Latency ML Kernels (Hybrid)
$280k
Anthropic is looking for a TPU Kernel Engineer in San Francisco, California. In this role, you will identify and resolve performance... ...issues across ML systems, particularly in research, training, and inference. You will design and optimize TPU kernels and provide critical...
Anthropic
San Francisco, CA
16 hours ago
Inference Engineer, Robotics
...level AI capabilities with the constraints of physical systems to improve peoples’ lives. About the Role We’re looking for a GPU Inference Engineer to contribute to improvements in model serving efficiency for our Robotics research. This is a high‑impact role where you’...
Work at office
Relocation package
OpenAI
San Francisco, CA
16 hours ago
Distributed Systems Engineer, Data & Inference Platform
...into useful intelligence - the inference services that serve LLMs at... ...about both. Researchers and ML engineers will hand you workloads that... ...cost across heterogeneous GPU fleets. Batching, scheduling,... ...by. ~ Experience operating Kubernetes-based infrastructure, including...
Flexible hours
Adaption
San Francisco, CA
22 days ago
Senior ML Inference Engineer Production Systems
...looking for a Senior Machine Learning Systems Engineer in San Francisco. In this role, you will build and operate production inference systems, optimizing for performance and... ...fluent in Python, and have strong knowledge in GPU-accelerated inference. Excellent communication...
MakerMaker.AI
San Francisco, CA
4 days ago
