Staff ML Inference Systems Engineer - Scalable GPU Infra (SF)

Acceler8 Talent

Acceler8 Talent is looking for a Member of Technical Staff focused on building and optimizing ML inference systems in San Francisco. The role involves designing end-to-end inference pipelines and enhancing performance under real-world workloads. Candidates should have strong software engineering skills, experience with ML inference systems, and proficiency in programming languages like Python and C++. This position offers an exciting opportunity to work at the forefront of AI infrastructure technology.

Vacancy posted 4 days ago

Similar jobs that could be interesting for you
Based on the Staff ML Inference Systems Engineer - Scalable GPU Infra (SF) in San Francisco, CA vacancy

Senior ML Training Systems Engineer - Distributed GPU Infra
...Francisco is looking for a Senior Software Engineer to build scalable infrastructure for large‑scale... ...will design distributed training systems and optimize GPU utilization while collaborating... ...have over 5 years of experience in ML infrastructure and a strong background...
Suggested
Baseten
San Francisco, CA
4 days ago
Systems Research Engineer, GPU Programming
$160k - $230k
...Systems Research Engineer, GPU Programming San Francisco About the Role As a Systems Research Engineer... ...kernels and algorithms for ML/AI applications. Working closely with... ...code to achieve better performance and scalability Collaborate with cross-functional...
Suggested
Full time
Remote work
Together AI
San Francisco, CA
4 days ago
ML Inference Systems Engineer
...looking for a Member of Technical Staff focused on ML systems and inference in San Francisco. You will design... ..., ensuring fast, predictable, and scalable performance. Key responsibilities... ...have strong foundations in software engineering, experience with ML inference systems...
Suggested
Gimlet Labs, Inc.
San Francisco, CA
5 days ago
Staff ML Performance & Systems Engineer — Scalable Inference
$180k - $250k
...You will design and implement innovative model serving architectures while working with the Applied ML team and customers. The ideal candidate has expertise in systems programming and deep understanding of cutting-edge ML infrastructure. Compensation ranges from $180,0...
Suggested
fal
San Francisco, CA
3 days ago
Software Engineer, ML Infra & Distributed Systems (Staff & Principal)
...Staff Software Engineer, ML Infra & Distributed Systems About the Role: As a Staff Software Engineer on the ML Infrastructure... ...world-class machine learning inference platforms. These platforms power... ...: Design and build scalable, high throughput, and low latency...
Suggested
Tubi TV
San Francisco, CA
4 days ago
Senior Systems Engineering
$225k
...ultra‑long context, and inference‑time compute to... ...The Role As a Software Engineer on the Inference & RL Systems team, you will design... ...performance bottlenecks across GPU, networking, and... ...issues in production ML systems Ability to... ...to bring you to SF, if possible A small...
Relocation
Visa sponsorship
Magic
San Francisco, CA
1 day ago
Senior AI/ML Infra & SRE Engineer
...Infrastructure Engineer – Bland As a Senior... ...the design of scalable architecture by... ...distributed systems using Kubernetes... ...and real-time inference serving across... ...industries. Lead – AI/ML Stack... ...capabilities. Staff DevOps Engineer... ...workloads with GPU support, implementing...
Temporary work
AI Chopping Block, Inc.
San Francisco, CA
4 days ago
Senior ML Systems Engineer - LLM Infra & Governance
...A tech-driven company focused on blockchain solutions is seeking a Senior ML Systems Engineer. In this role, you will build reusable workflows, automate model versioning, and deploy scalable AI systems. Candidates should have strong programming skills, experience with...
TRM Labs
San Francisco, CA
5 days ago
Distributed Systems Engineer, Data & Inference Platform
...real-time. Our vision is AI systems that are flexible, personalized... ...useful intelligence - the inference services that serve LLMs at scale... ...about both. Researchers and ML engineers will hand you workloads that... ...and cost across heterogeneous GPU fleets. Batching, scheduling,...
Flexible hours
Adaption
San Francisco, CA
20 days ago
Senior ML Inference Engineer Production Systems
...is looking for a Senior Machine Learning Systems Engineer in San Francisco. In this role, you will build and operate production inference systems, optimizing for performance and reliability... ...in Python, and have strong knowledge in GPU-accelerated inference. Excellent...
MakerMaker.AI
San Francisco, CA
4 days ago
Senior Staff AI Research TLM - AI Systems
$270k - $340k
...LLM) training and inference efficiency beyond... ...across algorithms, systems, and infrastructure... ...details with engineering partners. Role Summary... ...end‑to‑end ML systems for distributed... ...training/inference, scalable model... ...distributed systems and infra teams to push the...
Local area
Worldwide
I did my part and supported the Regular Toilet
San Francisco, CA
3 days ago
Staff Technical Lead for Inference & ML Performance
...Staff Technical Lead for Inference & ML Performance San Francisco fal is the generative media ecosystem... ...shape the future of fal's inference engine and ensure our generative models achieve... ...enhancing inference speed and scalability. Mentor and scale your team. Coach...
Fal
San Francisco, CA
4 days ago
Staff, Pre-Training Infra — Distributed ML Training
B Capital is seeking a talented engineer in San Francisco to build and scale distributed training systems for machine learning models. The ideal candidate will have strong... ...in distributed training frameworks, debug GPU compute systems, and optimize training throughput...
B Capital
San Francisco, CA
2 days ago
ML Systems Engineer, Robotics
$248.8k - $311k
...Physical AI and developing ML pipelines for... ...The Role As an ML Systems Engineer on the Physical AI team... ...and build platforms for scalable, reliable, and efficient... ...performance tracking of model inference. Lead: Own projects... ..., including GPU-level algorithm optimizations...
Full time
Scale AI
San Francisco, CA
16 days ago
ML Systems Engineer - Scalable AI Pipelines
AI Chopping Block, Inc. is seeking a Machine Learning Engineer to design and build scalable machine learning systems. Responsibilities involve developing end-to-end ML pipelines, optimizing AI models for mobile environments, and integrating AI-driven solutions into applications...
AI Chopping Block, Inc.
San Francisco, CA
3 days ago
Senior ML Systems Engineer, LLM Infra & AI Ops
TRM Labs is looking for a Senior or Staff ML Systems Engineer to focus on building and scaling the technical infrastructure for AI/ML systems... ...have strong Python programming skills, a solid background in scalable infrastructure, and experience deploying LLM workflows....
TRM Labs
San Francisco, CA
5 days ago
Senior ML Inference Systems Engineer
...AI workloads is seeking a Member of Technical Staff to design and optimize inference systems. The role involves managing KV cache... ...Ideal candidates should have strong software engineering skills and experience with ML inference systems, particularly in Python and...
Gimlet Labs
San Francisco, CA
3 days ago
ML Infra Engineer — Scalable Training Systems
A leading tech company in San Francisco seeks a Machine Learning Engineer to build and maintain infrastructure for large-scale model training. In this hands-on role, you will design systems, work closely with researchers, and optimize training processes. Candidates should...
Monograph
San Francisco, CA
5 days ago
GPU Systems Engineer - HPC / Parallel Computing
$160k - $320k
...excellence. We seek engineers/researchers with... ...’re looking for a systems engineer with HPC... ...to help scale AI inference. You’ll leverage... ...systems to optimize GPU performance at the... ...at either our SF or LA offices Tech... ...HPC techniques into scalable AI inference...
Full time
Work at office
Vast.ai Inc.
San Francisco, CA
more than 2 months ago
Staff ML Systems Engineer — Frontier AI Infra
A tech-first company is seeking a Member of Technical Staff to focus on cutting-edge AI research and development. The role involves building and scaling training and inference infrastructure, designing ML kernels, and optimizing performance. Ideal candidates should have...
Mirendil
San Francisco, CA
5 days ago
Software Engineer GPU Networking & Distributed Systems
...powers mission‑critical inference for the world's most... ...help build the platform engineers turn to to ship AI products... ...the global operating system for distributed,... ...engineers to lead our GPU Networking efforts, making... ...Exposure to a variety of ML startups, offering unparalleled...
Flexible hours
Baseten
San Francisco, CA
5 days ago
Lead Distributed Systems Engineer (Cloud Infra)
$117.2k - $313.7k
...immediate opportunities for Lead software engineers who want their lines of code to have... ...issues and drive innovations that improve system scalability, robustness, and availability.... ...Design patterns & Experience with Big-Data/ML and S3 Hands-on experience with Streaming...
Immediate start
Remote work
Salesforce
San Francisco, CA
5 days ago
Remote Systems Engineer — Build Fast ML Infra & Open Source
Stars Arena is seeking a Systems Engineer to develop pioneering machine learning infrastructure that enhances the efficiency of experiments on local and cloud GPUs. This role requires strong skills in Python and a passion for understanding system internals. Alongside a...
Remote job
Local area
Flexible hours
Stars Arena
San Francisco, CA
4 days ago
Staff + Sr. Software Engineer, Cloud Inference
$320k
...Staff + Sr. Software Engineer, Cloud Inference San Francisco, CA About Anthropic Anthropic's mission is to... ...reliable, interpretable, and steerable AI systems. We want AI to be safe and... ...about LLM serving; prior inference or ML experience is not required Thrive...
Work at office
Visa sponsorship
Flexible hours
Anthropic
San Francisco, CA
1 day ago
Machine Learning Engineer: LLM Interpretability & Systems
...lightweight, model-agnostic system that enforces... ...Machine Learning Engineer will operate deep... ...policy enforcement at inference time. Who You... ...graph databases Infra: Docker, Kubernetes... ...and customer VPCs ML: Self hosted models on multiple GPU providers and...
CTGT
San Francisco, CA
3 days ago
Staff SWE, Inference Infrastructure — High-Scale ML
Jaide Health is seeking experienced Members of Technical Staff to join their Model Serving team. This role involves... ...ideal candidate will have significant experience in engineering, especially with distributed systems. The company offers a hybrid work model and extensive...
Jaide Health
San Francisco, CA
3 days ago
Senior Site Reliability Engineer (GPU Clusters) - Hosting
$250k
...next-generation GPU platform designed... ..., and inference at scale. The company... ...for a Senior / Staff Site Reliability Engineer to support and scale... ...with platform, ML, and infrastructure... ...growth and scalability. Don’t miss out... ...infrastructure systems Improve CI/CD...
Permanent employment
Remote work
San Francisco, CA
20 days ago
Staff ML Infra Engineer - Low-Latency Distributed Systems
...A leading streaming service is seeking a Staff Software Engineer to enhance ML infrastructure. The role involves designing scalable systems, mentoring engineers, and collaborating with cross-functional teams. Candidates should have over 8 years of experience in building...
Tubi TV
San Francisco, CA
4 days ago
Senior ML Systems Engineer Training Infra for Robotics
$295k - $380k
...OpenAI is searching for a Senior Software Engineer to join their Robotics team in San Francisco. The role focuses on maintaining and... ...framework while actively reviewing and debugging code within ML systems. The ideal candidate should thrive in hands-on settings, possess...
OpenAI
San Francisco, CA
4 days ago
Staff GRC Risk Architect for AI & Third-Party Risk
$300 per month
...Role We’re seeking a Staff GRC Risk... ...architecture, AI systems, data flows, and infrastructure... ...’ll also design scalable, automated GRC... ...architectures, and inference infrastructure Reviewing... ...in GRC, security engineering, or IT risk roles... ...with AI/ML systems, agentic AI...
Temporary work
Crusoe Energy Systems
San Francisco, CA
3 days ago
