Technical Staff Lead, AI Inference & GPU Infra
Wafer
A tech company specializing in AI infrastructure is seeking a skilled professional to build scalable infrastructure for AI model training and inference. You will lead architectural decisions and work with core systems that power their GPU optimization platform. Candidates should have expertise in GPU fundamentals, deep learning frameworks like PyTorch and TensorFlow, along with experience in C++ and Python. Join a team at the forefront of AI technology in the heart of San Francisco.
$150k - $300k
...agentic models to the infra that enables anyone to... ...Andrej Karpathy (Eureka AI, Tesla, OpenAI), Tri Dao... ...cloud LLM serving, LLM inference optimization and RL systems... ...training stack. Core Technical Responsibilities LLM... ...operates across our cloud GPU fleets. GPU‑Aware... Suggested, Work at office, Remote work, Visa sponsorship, Relocation package, Flexible hours, Shift work
$220k
...is looking for an engineer to join their team in San Francisco. You will work on building and operating the inference engine, supporting new models, migrating GPU kernels, and developing a Rust-based serving runtime. The ideal candidate has 3+ years of experience in... Suggested
- A cutting-edge AI infrastructure startup is seeking a Kubernetes DevOps Engineer to join their innovative team in San Francisco. The... ...clusters across various environments, focusing on high-performance GPU workloads. Ideal candidates will have deep Kubernetes expertise... Suggested
$200k - $350k
...Member Of Technical Staff, Inference & Serving Inception creates the world's fastest, most efficient AI models. Our Mercury model is the world's... ...performance computing and GPU programming (CUDA). Experience... ...of diffusion models and leading AI researchers Shape... Suggested, Immediate start, Flexible hours
$180k
...Member Of Technical Staff - Inference Palo Alto, CA About xAI xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its... ...-scaling) to deep low-level optimizations (GPU kernels, quantization, speculative decoding... Suggested, Temporary work
- ...maintain large distributed ML training and inference clusters Develop efficient, scalable... ...scales Analyze, profile and debug low-level GPU operations to optimize performance Stay... ...(GCP, AWS, or Azure) and their ML/AI service offerings Familiarity with containerization...
- Member of Technical Staff, ML Infrastructure & Inference Overview We are a cutting-edge AI infrastructure company building a scalable cloud platform designed for next-generation... ..., Model Serving, Distributed Systems, GPU Infrastructure, AI Infrastructure, Inference Runtime...
- ...Inference Engine Engineer We build and run the inference engine behind every Perplexity query and deploy dozens of model architectures... ...bottlenecks from network ingress through continuous batching and GPU kernel interleaving. Build dashboards, alerts, and automated...
- ...with the web by building AI agents that can... ...for a member of the AI technical staff to join the founding team... ...Responsibilities: Scale infra for post-training of multimodal... ...infra for agentic inference (throughput and latency... ...ML infrastructure (GPU clusters) and supporting... Work at office, Relocation, Visa sponsorship
- Overview About Liquid AI Spun out of MIT CSAIL, we build general... .... The Opportunity Our Edge Inference team compiles Liquid... ...will work directly with the technical lead on problems that require deep... ...inference kernels for CPU, NPU, and GPU architectures across diverse...
- ...of humanity. About the Role As a Technical Lead on the Future of Computing Research team... ...responsible for implementing the low-level inference stack, including kernel development and... .... About OpenAI OpenAI is an AI research and deployment company dedicated... Work at office, Relocation package
- ...Staff Technical Lead for Inference & ML Performance San Francisco fal is the generative media ecosystem powering the next generation of AI products. We build the infrastructure, tools, and model access that teams need to move from idea to production, and do it at...
- About the Team Our Inference team brings OpenAI’s most capable research... ...access our state-of-the-art AI models, allowing them to do things... ...across emerging GPU platforms. You’ll work across... ...collaborate closely with research, infra, and performance teams to ensure...
- A leading AI technology firm in San Francisco is seeking an AI Infra Engineer to enhance their infrastructure. The successful candidate will design and maintain Kubernetes clusters and manage Slurm for distributed training. Important skills include extensive experience...
- ...Associates Limited is looking for a Senior Storage Engineer to support large-scale AI infrastructure in San Francisco. This role involves designing scalable storage solutions for high-performance GPU platforms. The ideal candidate has extensive experience in storage... Remote job
$209k - $253k
A leading AI infrastructure company in San Francisco seeks a Staff Software Engineer to design and develop control systems for GPU node management. The candidate will be critical in building foundational cloud infrastructure and achieving business goals. This role requires...
- ...large-scale driver navigation AI models and one of the top chess... ..., SDK design, and large-scale inference infrastructure. You'll... ...cloud ingestion to distributed GPU inference pipelines that run our... ...orchestration frameworks or ML infra tools (e.g., DeepSpeed, Triton... Worldwide
$200k - $350k
...Member Of Technical Staff, Training Infra Bay Area Ai Systems Inception creates the world's fastest, most efficient AI models. Our Mercury model is... ...Collaborate with the inventors of diffusion models and leading AI researchers Shape Foundational Technology : Your... Immediate start, Flexible hours
$150k - $300k
...Chief Scientist, Together AI), Dylan Patel (... ...tuning runs on managed GPU clusters with a single... ...the jobs. Core Technical Responsibilities Hosted... ...Kubernetes-based training and inference orchestration across... ..., training methods, infra patterns - and the ability... Work at office, Local area, Remote work, Visa sponsorship, Relocation package, Flexible hours
- ...About Us Gimlet is building the next generation of AI infrastructure: large-scale AI datacenters and the orchestration... ...About the role Gimlet Labs is seeking a Member of Technical Staff focused on kernels and GPU performance. In this role, you will work close to...
- ...based platform and solving complex systems challenges, focusing on GPU infrastructures and multi-cloud environments. The ideal candidate... ...solutions. Join a team dedicated to building open superintelligence and make an impact in the AI space. B Capital
- ...Labs is building the first heterogeneous neocloud for AI workloads. As AI systems scale, the industry is... ...datacenters. Mission Gimlet Labs is seeking a Member of Technical Staff focused on ML systems and inference. In this role, you will design and build inference systems...
$170k - $220k
Member of Technical Staff - Infrastructure & LLMs Location: San... ...next-generation inference infrastructure for LLMs... ...problems like: Scaling multi-GPU inference workloads... ...Ownership: Drive core infra design with zero red tape... ...GPU orchestration, or AI infra Strong technical... Full time, Temporary work, Immediate start, Visa sponsorship, Work visa
- Introducing Moonlake, AI for creating world simulations. Scope... ...tensor+pipeline parallel; NCCL tuning. GPU + kernel performance Nsight... ...packing, KV-cache tricks. Inference optimization Low-latency... ...AWQ), distillation, pruning. Infra + reliability SLURM/K8s multi...
- ...About the Job We are seeking a highly technical Inference Engine Engineer to optimize the... ...designing, implementing, and optimizing GPU kernels and supporting infrastructure for... ...next-generation generative and agentic AI workloads. Your work will directly power... Worldwide, Flexible hours
$142.2k - $204.6k
...Role As a software engineer for GenAI inference, you will help design, develop, and... ...operations, etc. Hands-on experience with CUDA, GPU programming, and key libraries (cuBLAS,... ...Databricks Databricks is the data and AI company. More than 10,000 organizations... Local area, Worldwide
$187.5k - $395k
...Software Engineer, Inference Luma's mission is to build multimodal AI to expand human imagination and capabilities. We believe that multimodality is critical... ...scheduling systems to optimally leverage our expensive GPU resources while meeting internal SLOs Build and...
- ...AI Infra Engineer We are looking for an AI Infra engineer to join our growing team. We... ...you will be partnering closely with our Inference and Research teams to build, deploy, and... ...training strategies) Experience managing GPU clusters and optimizing compute resource...
- ...Customer Support Engineer (GPU Cluster) San... ...Engineer at a pioneering AI company, you'll be the... ...training, fine tuning, and inference solutions with Together... ...dive deep into complex technical challenges, providing... ...We have contributed to leading open-source research, models... Full time, Remote work, Flexible hours, Night shift, Weekend work
- A tech company focused on AI is seeking a Site Reliability Engineer to ensure the reliability and performance of its GPU marketplace. This role involves maintaining service level... ...standards, design monitoring systems and lead incident response. Join a forward-thinking...