
Staff ML Inference Systems Engineer - Scalable GPU Infra (SF)

Acceler8 Talent

Acceler8 Talent is looking for a Member of Technical Staff focused on building and optimizing ML inference systems in San Francisco. The role involves designing end-to-end inference pipelines and enhancing performance under real-world workloads. Candidates should have strong software engineering skills, experience with ML inference systems, and proficiency in programming languages like Python and C++. This position offers an exciting opportunity to work at the forefront of AI infrastructure technology.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Staff ML Inference Systems Engineer - Scalable GPU Infra (SF) in San Francisco, CA vacancy
  •  ...Francisco is looking for a Senior Software Engineer to build scalable infrastructure for large‑scale...  ...will design distributed training systems and optimize GPU utilization while collaborating...  ...have over 5 years of experience in ML infrastructure and a strong background... 
    Suggested

    Baseten

    San Francisco, CA
    4 days ago
  • $160k - $230k

     ...Systems Research Engineer, GPU Programming San Francisco About the Role As a Systems Research Engineer...  ...kernels and algorithms for ML/AI applications. Working closely with...  ...code to achieve better performance and scalability Collaborate with cross-functional... 
    Suggested
    Full time
    Remote work

    Together AI

    San Francisco, CA
    4 days ago
  •  ...looking for a Member of Technical Staff focused on ML systems and inference in San Francisco. You will design...  ..., ensuring fast, predictable, and scalable performance. Key responsibilities...  ...have strong foundations in software engineering, experience with ML inference systems... 
    Suggested

    Gimlet Labs, Inc.

    San Francisco, CA
    5 days ago
  • $180k - $250k

     ...You will design and implement innovative model serving architectures while working with the Applied ML team and customers. The ideal candidate has expertise in systems programming and deep understanding of cutting-edge ML infrastructure. Compensation ranges from $180,0... 
    Suggested

    fal

    San Francisco, CA
    3 days ago
  •  ...Staff Software Engineer, ML Infra & Distributed Systems About the Role: As a Staff Software Engineer on the ML Infrastructure...  ...world-class machine learning inference platforms. These platforms power...  ...: Design and build scalable, high throughput, and low latency... 
    Suggested

    Tubi TV

    San Francisco, CA
    4 days ago
  • $225k

     ...ultra‑long context, and inference‑time compute to...  ...The Role As a Software Engineer on the Inference & RL Systems team, you will design...  ...performance bottlenecks across GPU, networking, and...  ...issues in production ML systems Ability to...  ...to bring you to SF, if possible A small... 
    Relocation
    Visa sponsorship

    Magic

    San Francisco, CA
    1 day ago
  •  ...Infrastructure Engineer – Bland As a Senior...  ...the design of scalable architecture by...  ...distributed systems using Kubernetes...  ...and real-time inference serving across...  ...industries. Lead – AI/ML Stack...  ...capabilities. Staff DevOps Engineer...  ...workloads with GPU support, implementing... 
    Temporary work

    AI Chopping Block, Inc.

    San Francisco, CA
    4 days ago
  •  ...A tech-driven company focused on blockchain solutions is seeking a Senior ML Systems Engineer. In this role, you will build reusable workflows, automate model versioning, and deploy scalable AI systems. Candidates should have strong programming skills, experience with... 

    TRM Labs

    San Francisco, CA
    5 days ago
  •  ...real-time. Our vision is AI systems that are flexible, personalized...  ...useful intelligence - the inference services that serve LLMs at scale...  ...about both. Researchers and ML engineers will hand you workloads that...  ...and cost across heterogeneous GPU fleets. Batching, scheduling,... 
    Flexible hours

    Adaption

    San Francisco, CA
    20 days ago
  •  ...is looking for a Senior Machine Learning Systems Engineer in San Francisco. In this role, you will build and operate production inference systems, optimizing for performance and reliability...  ...in Python, and have strong knowledge in GPU-accelerated inference. Excellent... 

    MakerMaker.AI

    San Francisco, CA
    4 days ago
  • $270k - $340k

     ...LLM) training and inference efficiency beyond...  ...across algorithms, systems, and infrastructure...  ...details with engineering partners. Role Summary...  ...end‑to‑end ML systems for distributed...  ...training/inference, scalable model...  ...distributed systems and infra teams to push the... 
    Local area
    Worldwide

    I did my part and supported the Regular Toilet

    San Francisco, CA
    3 days ago
  •  ...Staff Technical Lead for Inference & ML Performance San Francisco fal is the generative media ecosystem...  ...shape the future of fal's inference engine and ensure our generative models achieve...  ...enhancing inference speed and scalability. Mentor and scale your team. Coach... 

    Fal

    San Francisco, CA
    4 days ago
  • B Capital is seeking a talented engineer in San Francisco to build and scale distributed training systems for machine learning models. The ideal candidate will have strong...  ...in distributed training frameworks, debug GPU compute systems, and optimize training throughput... 

    B Capital

    San Francisco, CA
    2 days ago
  • $248.8k - $311k

     ...Physical AI and developing ML pipelines for...  ...The Role As an ML Systems Engineer on the Physical AI team...  ...and build platforms for scalable, reliable, and efficient...  ...performance tracking of model inference. Lead: Own projects...  ..., including GPU-level algorithm optimizations... 
    Full time

    Scale AI

    San Francisco, CA
    16 days ago
  • AI Chopping Block, Inc. is seeking a Machine Learning Engineer to design and build scalable machine learning systems. Responsibilities involve developing end-to-end ML pipelines, optimizing AI models for mobile environments, and integrating AI-driven solutions into applications... 

    AI Chopping Block, Inc.

    San Francisco, CA
    3 days ago
  • TRM Labs is looking for a Senior or Staff ML Systems Engineer to focus on building and scaling the technical infrastructure for AI/ML systems...  ...have strong Python programming skills, a solid background in scalable infrastructure, and experience deploying LLM workflows.... 

    TRM Labs

    San Francisco, CA
    5 days ago
  •  ...AI workloads is seeking a Member of Technical Staff to design and optimize inference systems. The role involves managing KV cache...  ...Ideal candidates should have strong software engineering skills and experience with ML inference systems, particularly in Python and... 

    Gimlet Labs

    San Francisco, CA
    3 days ago
  • A leading tech company in San Francisco seeks a Machine Learning Engineer to build and maintain infrastructure for large-scale model training. In this hands-on role, you will design systems, work closely with researchers, and optimize training processes. Candidates should... 

    Monograph

    San Francisco, CA
    5 days ago
  • $160k - $320k

     ...excellence.  We seek engineers/researchers with...  ...’re looking for a systems engineer with HPC...  ...to help scale AI inference. You’ll leverage...  ...systems to optimize GPU performance at the...  ...at either our SF or LA offices Tech...  ...HPC techniques into scalable AI inference... 
    Full time
    Work at office

    Vast.ai Inc.

    San Francisco, CA
    more than 2 months ago
  • A tech-first company is seeking a Member of Technical Staff to focus on cutting-edge AI research and development. The role involves building and scaling training and inference infrastructure, designing ML kernels, and optimizing performance. Ideal candidates should have... 

    Mirendil

    San Francisco, CA
    5 days ago
  •  ...powers mission‑critical inference for the world's most...  ...help build the platform engineers turn to to ship AI products...  ...the global operating system for distributed,...  ...engineers to lead our GPU Networking efforts, making...  ...Exposure to a variety of ML startups, offering unparalleled... 
    Flexible hours

    Baseten

    San Francisco, CA
    5 days ago
  • $117.2k - $313.7k

     ...immediate opportunities for Lead software engineers who want their lines of code to have...  ...issues and drive innovations that improve system scalability, robustness, and availability....  ...Design patterns & Experience with Big-Data/ML and S3 Hands-on experience with Streaming... 
    Immediate start
    Remote work

    Salesforce

    San Francisco, CA
    5 days ago
  • Stars Arena is seeking a Systems Engineer to develop pioneering machine learning infrastructure that enhances the efficiency of experiments on local and cloud GPUs. This role requires strong skills in Python and a passion for understanding system internals. Alongside a... 
    Remote job
    Local area
    Flexible hours

    Stars Arena

    San Francisco, CA
    4 days ago
  • $320k

     ...Staff + Sr. Software Engineer, Cloud Inference San Francisco, CA About Anthropic Anthropic's mission is to...  ...reliable, interpretable, and steerable AI systems. We want AI to be safe and...  ...about LLM serving; prior inference or ML experience is not required Thrive... 
    Work at office
    Visa sponsorship
    Flexible hours

    Anthropic

    San Francisco, CA
    1 day ago
  •  ...lightweight, model-agnostic system that enforces...  ...Machine Learning Engineer will operate deep...  ...policy enforcement at inference time. Who You...  ...graph databases Infra: Docker, Kubernetes...  ...and customer VPCs ML: Self hosted models on multiple GPU providers and... 

    CTGT

    San Francisco, CA
    3 days ago
  • Jaide Health is seeking experienced Members of Technical Staff to join their Model Serving team. This role involves...  ...ideal candidate will have significant experience in engineering, especially with distributed systems. The company offers a hybrid work model and extensive... 

    Jaide Health

    San Francisco, CA
    3 days ago
  • $250k

     ...next-generation GPU platform designed...  ..., and inference at scale. The company...  ...for a Senior / Staff Site Reliability Engineer to support and scale...  ...with platform, ML, and infrastructure...  ...growth and scalability. Don’t miss out...  ...infrastructure systems Improve CI/CD... 
    Permanent employment
    Remote work
    San Francisco, CA
    20 days ago
  •  ...A leading streaming service is seeking a Staff Software Engineer to enhance ML infrastructure. The role involves designing scalable systems, mentoring engineers, and collaborating with cross-functional teams. Candidates should have over 8 years of experience in building... 

    Tubi TV

    San Francisco, CA
    4 days ago
  • $295k - $380k

     ...OpenAI is searching for a Senior Software Engineer to join their Robotics team in San Francisco. The role focuses on maintaining and...  ...framework while actively reviewing and debugging code within ML systems. The ideal candidate should thrive in hands-on settings, possess... 

    OpenAI

    San Francisco, CA
    4 days ago
  • $300 per month

     ...Role We’re seeking a Staff GRC Risk...  ...architecture, AI systems, data flows, and infrastructure...  ...’ll also design scalable, automated GRC...  ...architectures, and inference infrastructure Reviewing...  ...in GRC, security engineering, or IT risk roles...  ...with AI/ML systems, agentic AI... 
    Temporary work

    Crusoe Energy Systems

    San Francisco, CA
    3 days ago
