Staff Infra Engineer - Global GPU ML Inference

The Token Company

The Token Company in San Francisco is seeking a Member of Technical Staff for their infrastructure team. In this role, you will own the cloud systems that serve our compression API and build global low-latency, high-throughput GPU ML inference infrastructure. The ideal candidate will have solid experience in cloud infrastructure, including AWS and Docker, and a proven track record in building production environments. Additional benefits include equity, housing, food, and visa sponsorship.

Vacancy posted 1 day ago

Similar jobs that could be interesting for you
Based on the Staff Infra Engineer - Global GPU ML Inference vacancy in San Francisco, CA

Staff Engineer - AI Infra & GPU Optimization
A tech startup focusing on AI optimization is seeking engineers in San Francisco to enhance their GPU kernel optimization framework. Candidates should possess... .... Previous experience in GPU programming and AI/ML research is advantageous. Join a small team committed...
Suggested
Wafer
San Francisco, CA
3 days ago
Staff Engineer, GPU AI Inference & RL Infrastructure
B Capital is seeking a skilled engineer for GPU infrastructure in San Francisco. This role involves designing and operating high-performance systems for model inference, synthetic data generation, and reinforcement learning. The ideal candidate has strong GPU systems experience...
Suggested
B Capital
San Francisco, CA
3 days ago
Member of Technical Staff - Edge Inference Engineer
...there. The Opportunity Our Edge Inference team compiles Liquid... ...require deep understanding of both ML architectures and hardware constraints... ...kernels for CPU, NPU, and GPU architectures across diverse... ...Experience Embedded software engineering experience or work on resource...
Suggested
Liquid AI
San Francisco, CA
4 days ago
Staff ML Inference Engineer — Model Efficiency (Remote)
Jaide Health is seeking an engineer for their Model Efficiency team in... ...focuses on building reliable ML systems while enhancing core performance... ...techniques such as GPU/CUDA optimizations and collaborate... ...and insights into the LLM inference ecosystem. A commitment to diversity...
Suggested
Remote job
Jaide Health
San Francisco, CA
5 days ago
Staff Engineer
..., Fly over AWS when it makes sense, PyTorch over legacy ML frameworks. The Work GPU inference : We run our own ASR models. Real-time transcription : WebSocket... ...C# Runtime : Bun, Node.js, Django, FastAPI ML : PyTorch Infra : Fly.io, Terraform, AWS (RDS), Redis Protocols : gRPC,...
Suggested
Aqua Voice
San Francisco, CA
1 day ago
Staff Engineer - ML Inference & Model Efficiency
A leading AI research firm in San Francisco is seeking a Member of Technical Staff specialized in Model Efficiency. In this role, you will enhance LLM inference systems by tackling performance issues and collaborating with cross-functional teams. Ideal candidates have over...
Remote work
Cohere
San Francisco, CA
2 days ago
Staff Engineer: Foundation Model API & GPU Inference
$192k - $260k
A leading data and AI company is seeking a Staff Engineer to design and implement core systems for Foundation Model Serving. The ideal candidate... ...closely across teams to ensure operational excellence in GPU serving workloads. Competitive salary range of $192,000 to $26...
Databricks Inc.
San Francisco, CA
2 days ago
Staff Machine Learning Engineer
$200k - $250k
...Overview Build and operate the ML platform that powers... ...scalable training, inference, and cost‑efficient operations... ...ECS, SageMaker, GPU fleets, model serving, autoscaling... ..., or similar). Prior staff‑level role in a company with a significant AI infra footprint. Experience...
Remote work
AppFolio
San Francisco, CA
2 days ago
Staff Engineer, ML Inference Systems
Acceler8 Talent is seeking a Member of Technical Staff focused on ML Systems & Inference in San Francisco, California. This role includes building and... ...AI workloads. The ideal candidate has strong software engineering roots and experience in inference systems. You will...
Acceler8 Talent
San Francisco, CA
4 days ago
Staff Data Platform Engineer — Scale Global ML Infra
...technology company in San Francisco is seeking a Data Platform Engineer to drive architecture and implementation of core systems. The ideal... ...and demonstrates strong analytical skills in fields such as ML and statistics. Responsibilities include planning technical roadmaps...
Hive
San Francisco, CA
3 days ago
Senior / Staff Infrastructure Engineer
$160k - $300k
...product development. We empower global innovators in automotive,... ...mission is to revolutionize how engineering decisions are made, turning... ...About the Role As a Senior / Staff Infrastructure Engineer at... ...distributed systems) Exposure to ML infra Personality & Values:...
Work at office
Visa sponsorship
Flexible hours
Apiphany
San Francisco, CA
22 hours ago
Staff ML Systems Engineer — Frontier AI Infra
A tech-first company is seeking a Member of Technical Staff to focus on cutting-edge AI research and development. The role involves building and scaling training and inference infrastructure, designing ML kernels, and optimizing performance. Ideal candidates should have...
Mirendil
San Francisco, CA
2 days ago
Senior Staff Network Reliability Engineer (Global GPU)
Crusoe in San Francisco is looking for a Senior Staff Network Operations Engineer to oversee the reliability of its global network. This role entails leading incident responses, defining operational standards, and guiding a team of engineers in maintaining a high-performing...
ProducePay
San Francisco, CA
3 days ago
Staff Infra Engineer: Kubernetes & Bare-Metal Automation
Crusoe is seeking a Staff Software Infrastructure Engineer in San Francisco to manage cloud infrastructure, develop... ...critical role requires expertise in GPU troubleshooting, strong Linux skills... ...make a significant impact on the global energy landscape.
Crusoe
San Francisco, CA
3 days ago
Staff Engineer, Mid-Training Infra for Large-Scale AI
A cutting-edge AI research firm in San Francisco is seeking talent to build and optimize GPU infrastructure for large-scale model inference and training workloads. The ideal candidate will have hands-on experience with GPU systems and optimization techniques, actively...
Reflection
San Francisco, CA
5 days ago
Staff Computer Vision Deployment Engineer (Production ML Infra)
Claryo is seeking a Staff Software Engineer with a focus on Computer Vision Deployment based in San Francisco. The successful candidate will develop... ...include creating and managing distributed cloud GPU infrastructures and building comprehensive computer vision pipelines...
Work at office
3 days per week
Claryo
San Francisco, CA
1 day ago
AI Inference Engineer (Member of Technical Staff)
Requirements Deep experience with GPU programming and performance work (CUDA, Triton, CUTLASS, or similar)... ...laid out for you 3+ years of professional software engineering experience with meaningful work on ML inference or high-performance systems Familiarity with at least...
Perplexity AI
San Francisco, CA
5 days ago
Staff Cloud Support Engineer
$300 per month
...We’re crafting the engine that powers a world... ...the Role As a Senior Staff Cloud Support... ...networking, and AI/ML infrastructure, and... ...AI infrastructure globally. What You’ll Be Working... ...Troubleshoot NCCL, IB, GPU driver/firmware... ...workloads (training + inference) with performance...
Full time
Temporary work
Epoch Biodesign
San Francisco, CA
4 days ago
Member of Technical Staff (AI Infrastructure Engineer)
We are looking for an AI Infra engineer to join our growing team. We work... ...partnering closely with our Inference and Research teams to build,... ...observability solutions tailored to ML workloads running on... ...strategies) Experience managing GPU clusters and optimizing compute...
Perplexity
San Francisco, CA
3 days ago
Staff ML Engineer: Efficient ML & Low-Latency AI
...candidates with expertise in AI simulation development. The role emphasizes optimizing training efficiency, enhancing GPU performance, and ensuring low-latency inference. Applicants should be proficient in methodologies for gradient checkpointing, Nsight profiling, and job...
Embedding VC
San Francisco, CA
5 days ago
Staff Engineer, AI Inference & Distributed Systems
Sail Research in San Francisco is seeking a talented engineer to design and implement robust systems that ensure fast and cost-efficient AI inference at global scale. You will be responsible for building high-performance schedulers and optimizing global routing while focusing...
Sail Research
San Francisco, CA
2 days ago
Staff ML Engineer, Fine Tuning - Slack
$197.3k - $313.7k
...finetuning training pipelines on GPU infrastructure. Brainstorm with Product... ...Familiarity with model optimization for inference (quantization, pruning, speculative decoding...
Work at office
Salesforce, Inc.
San Francisco, CA
2 days ago
Staff GenAI Inference Engineer: Optimize LLM Serving Latency
$190.9k - $232.8k
A leading data and AI company is seeking a Staff Software Engineer for GenAI inference to lead the architecture and optimization of the inference engine. The role requires expertise in CUDA, GPU programming, and distributed systems design. Ideal candidates will have a strong...
Menlo Ventures
San Francisco, CA
4 days ago
Staff ML Infra Engineer: Large-Scale Pretraining & MLOps
$181.1k - $318.4k
Apple Inc. is looking for a Staff ML Infrastructure Engineer in San Francisco to lead pre-training initiatives for cutting-edge foundation models in machine learning. The successful candidate will have over 6 years of experience in building scalable backend systems, be...
Apple
San Francisco, CA
4 days ago
Staff, ML Infrastructure Engineer
$227.2k - $417k
Software Engineer, ML Infra & Distributed Systems (Staff & Principal) About the Role: As a Software Engineer on the ML Infrastructure team, you will collaborate... ...Product teams to build world‑class machine learning inference platforms. These platforms power essential services...
Full time
Temporary work
Local area
Flexible hours
Tubi Tv
San Francisco, CA
1 day ago
Staff Engineer
...a web application that distills complex ML signals, building automation tools that run... ...for: We’re looking for an experienced engineer to help shape our architecture, strengthen... ...platform securing digital trust for leading global businesses. Our deep investments in...
Sift Science
San Francisco, CA
2 days ago
Staff ML Infra Engineer - Low-Latency Distributed Systems
A leading streaming service is seeking a Staff Software Engineer to enhance ML infrastructure. The role involves designing scalable systems, mentoring engineers, and collaborating with cross-functional teams. Candidates should have over 8 years of experience in building...
Tubi Tv
San Francisco, CA
5 days ago
Staff Engineer, vLLM Inference & DevRel
$200k - $400k
Inferact is looking for a Developer Relations Engineer in San Francisco, California, to help developers utilize vLLM for AI inference. This unique role involves teaching technical concepts, creating educational content, and engaging with the AI infrastructure community....
Remote work
Inferact
San Francisco, CA
1 day ago
Staff Engineer, Engineering Productivity & AI Quality
$253k - $308k
Staff Engineer, Engineering Productivity & AI Quality Harper is an AI-native commercial insurance... ...productivity, platform, CI/CD, build, test‑infra, or internal tooling that other engineers... ...not a process or PM role. Production AI/ML systems experience (agent harness, eval...
Part time
Work at office
Relocation
Harper Group
San Francisco, CA
4 days ago
Staff Engineer: GPU Kernels & AI Performance
...company in San Francisco is seeking a Member of Technical Staff focused on kernels and GPU performance. This role involves optimizing GPU and... ...various hardware. Ideal candidates have strong software engineering foundations and experience with performance-critical systems...
Gimlet Labs
San Francisco, CA
5 days ago
