Technical Staff Lead, AI Inference & GPU Infra
Wafer
A tech company specializing in AI infrastructure is seeking a skilled professional to build scalable infrastructure for AI model training and inference. You will lead architectural decisions and work with core systems that power their GPU optimization platform. Candidates should have expertise in GPU fundamentals and deep learning frameworks such as PyTorch and TensorFlow, along with experience in C++ and Python. Join a team at the forefront of AI technology in the heart of San Francisco.
- ...About the Team Our Inference team brings OpenAI’s most capable research... ...access our state-of-the-art AI models, allowing them to do... ...infrastructure across emerging GPU platforms. You’ll work across... ...collaborate closely with research, infra, and performance teams to...
Full time
$150k - $300k
...agentic models to the infra that enables anyone to... ...Andrej Karpathy (Eureka AI, Tesla, OpenAI), Tri Dao... ...cloud LLM serving, LLM inference optimization and RL systems... ...training stack. Core Technical Responsibilities LLM... ...operates across our cloud GPU fleets. GPU‑Aware...
Work at office, Remote work, Visa sponsorship, Relocation package, Flexible hours, Shift work
$380k
...integrating multimodal functionalities into our AI products, ensuring they are reliable,... ...About the Role We're looking for a GPU Inference Engineer to contribute to improvements... ...initiatives by building a stronger technical foundation. In this role you will:...
Work at office, Relocation package
$220k
...is looking for an engineer to join their team in San Francisco. You will work on building and operating the inference engine, supporting new models, migrating GPU kernels, and developing a Rust-based serving runtime. The ideal candidate has 3+ years of experience in...
- A cutting-edge AI infrastructure startup is seeking a Kubernetes DevOps Engineer to join their innovative team in San Francisco. The... ...clusters across various environments, focusing on high-performance GPU workloads. Ideal candidates will have deep Kubernetes expertise...
- ...agents, enterprises, and even nation states. Our team of AI researchers and company builders come from DeepMind,... ...the Role Design, build, and operate large-scale GPU infrastructure for high-throughput model inference and mid-training workloads. Develop systems that power...
Relocation package
- Member of Technical Staff, ML Infrastructure & Inference Overview A cutting-edge AI infrastructure company is building a scalable cloud platform designed for next-generation... ..., Model Serving, Distributed Systems, GPU Infrastructure, AI Infrastructure, Inference Runtime...
- Overview Build low-latency inference pipelines for on-device deployment, enabling real-time next... ...optimize distributed inference systems on GPU clusters, pushing throughput with large-... ...for maximum efficiency, throughput, and responsiveness
Genesis AI
Remote job
$220k
We build and run the inference engine behind every Perplexity query and deploy dozens of model architectures at scale with tight latency and... ...scheduling and KV-cache management to support in API Gateway. GPU kernels migration to CuTe DSL. Port our in-house CUDA kernels to...
- Overview About Liquid AI Spun out of MIT CSAIL, we build general... .... The Opportunity Our Edge Inference team compiles Liquid... ...will work directly with the technical lead on problems that require deep... ...inference kernels for CPU, NPU, and GPU architectures across diverse...
- ...of humanity. About the Role As a Technical Lead on the Future of Computing Research team... ...responsible for implementing the low-level inference stack, including kernel development and... .... About OpenAI OpenAI is an AI research and deployment company dedicated...
Work at office, Relocation package
$200k - $280k
...intersection of efficient inference (algorithms,... ...engines, or similar), GPU performance,... ...collaborating with infra, research, and product... ...experience owning complex technical projects end‑to‑end... ...leadership (Staff level) Set technical... ...Together AI is an Equal Opportunity...
Full time
- ...with the web by building AI agents that can... ...for a member of the AI technical staff to join the founding team... ...Responsibilities: Scale infra for post-training of... ...Scale infra for agentic inference (throughput and latency... ...ML infrastructure (GPU clusters) and supporting...
Work at office, Relocation, Visa sponsorship
$150k - $200k
A tech startup specializing in AI infrastructure seeks an AI Infrastructure Specialist to lead technical deployments for GPU neocloud and AI Factory customers. Ideal candidates have over 5 years of Kubernetes experience, practical GPU skills, and networking knowledge. Offering...
Flexible hours
- ...Staff Technical Lead for Inference & ML Performance San Francisco fal is the generative media ecosystem powering the next generation of AI products. We build the infrastructure, tools, and model access that teams need to move from idea to production, and do it at...
$320k - $405k
..., and steerable AI systems. We want... ...AI systems. Staff Infrastructure Engineer, Node Infra About the role... ...that keep every GPU, TPU and... ...Own the technical strategy and roadmap... ...internal research/inference/product teams to... ...Track record of leading complex, multi-quarter...
Work at office, Visa sponsorship, Flexible hours
- A leading AI technology firm in San Francisco is seeking an AI Infra Engineer to enhance their infrastructure. The successful candidate will design and maintain Kubernetes clusters and manage Slurm for distributed training. Important skills include extensive experience...
- vCluster Labs is seeking an AI Infrastructure Specialist to engage directly with customers in deploying GPU solutions. You will drive technical deployments, optimize infrastructure, validate Kubernetes, and build self-sufficiency with customer teams. Ideal candidates will...
$209k - $253k
A leading AI infrastructure company in San Francisco seeks a Staff Software Engineer to design and develop control systems for GPU node management. The candidate will be critical in building foundational cloud infrastructure and achieving business goals. This role requires...
- ...Introducing Moonlake, AI for creating world simulations.... ...'re looking for a Member of Technical Staff - Data & ML Infrastructure Engineer... ...'s model training and inference infrastructure. This role... ...regressions. You'll work across GPU kernels, inference systems, distributed...
$200k - $280k
A leading AI company in San Francisco is looking for a Staff Machine Learning Engineer to enhance inference systems at production scale. You will design algorithms, optimize performance, and collaborate on RL and post-training pipelines. Ideal candidates have 3+ years of...
Full time
- ...based platform and solving complex systems challenges, focusing on GPU infrastructures and multi-cloud environments. The ideal candidate... ...solutions. Join a team dedicated to building open superintelligence and make an impact in the AI space.
B Capital
- ...combines a foundation model for physics with GPU-native solvers to deliver unprecedented... .... You will implement parsers, simulation/inference pipelines, and distributed execution... ...collaborating on internal UIs. Cloud and infra experience (GCP/AWS, Terraform) and operating...
Remote work, Flexible hours
$170k - $220k
Member of Technical Staff - Infrastructure & LLMs Location: San... ...next-generation inference infrastructure for LLMs... ...problems like: Scaling multi-GPU inference workloads... ...Ownership: Drive core infra design with zero red tape... ...GPU orchestration, or AI infra Strong technical...
Full time, Temporary work, Immediate start, Visa sponsorship, Work visa
- Gimlet Labs is building the first heterogeneous neocloud for AI workloads. As AI systems scale, the industry is hitting... ...AI datacenters. Gimlet Labs is seeking a Member of Technical Staff focused on kernels and GPU performance. In this role, you will work close to accelerators...
- ...Labs is building the first heterogeneous neocloud for AI workloads. As AI systems scale, the industry is... ...class AI datacenters. Gimlet Labs is seeking a Member of Technical Staff focused on ML systems and inference. In this role, you will design and build the inference...
$150k - $300k
...Chief Scientist, Together AI), Dylan Patel (... ...tuning runs on managed GPU clusters with a single... ...that runs the jobs. Core Technical Responsibilities Hosted... ...Kubernetes-based training and inference orchestration across... ..., training methods, infra patterns - and the ability...
Work at office, Local area, Remote work, Visa sponsorship, Relocation package, Flexible hours
- Introducing Moonlake, AI for creating world simulations. Scope... ...tensor+pipeline parallel; NCCL tuning. GPU + kernel performance Nsight... ...packing, KV-cache tricks. Inference optimization Low-latency... ...AWQ), distillation, pruning. Infra + reliability SLURM/K8s multi...
- ...individuals, agents, enterprises, and even nation states. Our team of AI researchers and company builders come from DeepMind, OpenAI,... ...workloads through optimization of communication, memory usage, and GPU utilization. Build and maintain training pipelines that support...
Relocation package
$142.2k - $204.6k
...Role As a software engineer for GenAI inference, you will help design, develop, and... ...operations, etc. Hands-on experience with CUDA, GPU programming, and key libraries (cuBLAS,... ...Databricks Databricks is the data and AI company. More than 10,000 organizations...
Local area, Worldwide


