
Staff ML Inference Systems Engineer - Scalable GPU Infra (SF)

Acceler8 Talent

Acceler8 Talent is looking for a Member of Technical Staff focused on building and optimizing ML inference systems in San Francisco. The role involves designing end-to-end inference pipelines and enhancing performance under real-world workloads. Candidates should have strong software engineering skills, experience with ML inference systems, and proficiency in programming languages like Python and C++. This position offers an exciting opportunity to work at the forefront of AI infrastructure technology.

Vacancy posted 3 days ago
Similar jobs that could be interesting for you
Based on the Staff ML Inference Systems Engineer - Scalable GPU Infra (SF) in San Francisco, CA vacancy
  •  ...Francisco is looking for a Senior Software Engineer to build scalable infrastructure for large‑scale...  ...will design distributed training systems and optimize GPU utilization while collaborating...  ...have over 5 years of experience in ML infrastructure and a strong background... 
    Suggested

    Baseten

    San Francisco, CA
    4 days ago
  • $200k - $280k

     ...leading AI company in San Francisco is looking for a Staff Machine Learning Engineer to enhance inference systems at production scale. You will design algorithms,...  ...pipelines. Ideal candidates have 3+ years of experience in ML systems, preferably with a strong foundation in... 
    Suggested
    Full time

    AI Chopping Block, Inc.

    San Francisco, CA
    1 day ago
  • $160k - $320k

    A leading AI computing firm is seeking a Systems Engineer in San Francisco or Los Angeles to scale AI inference. Candidates should have strong C++ skills, HPC experience...  ...techniques. Responsibilities include designing GPU kernels, optimizing performance, and... 
    Suggested

    Vast.ai

    San Francisco, CA
    4 days ago
  • Genesis AI is seeking an experienced individual to develop low-latency inference pipelines for on-device deployment in robotics. The role involves designing and optimizing distributed systems on GPU clusters, implementing efficient low-level code such as CUDA and Triton... 
    Suggested

    Genesis AI

    San Francisco, CA
    4 days ago
  • $160k - $320k

     ...excellence. We seek engineers/researchers with...  ...'re looking for a systems engineer with HPC...  ...to help scale AI inference. You'll leverage your...  ...to optimize GPU performance at the...  ...site at either our SF or LA offices Tech...  ...HPC techniques into scalable AI inference solutions... 
    Suggested
    Full time
    Work at office

    Vast.ai

    San Francisco, CA
    4 days ago
  •  ...looking for a Member of Technical Staff focused on ML systems and inference in San Francisco. You will design...  ..., ensuring fast, predictable, and scalable performance. Key responsibilities...  ...have strong foundations in software engineering, experience with ML inference systems... 

    Gimlet Labs, Inc.

    San Francisco, CA
    4 days ago
  • $180k - $250k

     ...You will design and implement innovative model serving architectures while working with the Applied ML team and customers. The ideal candidate has expertise in systems programming and deep understanding of cutting-edge ML infrastructure. Compensation ranges from $180,0... 

    fal

    San Francisco, CA
    2 days ago
  • $227.2k - $417k

     ...Software Engineer, ML Infra & Distributed Systems (Staff & Principal) San Francisco, CA; Los Angeles, CA; New...  ...build world-class machine learning inference platforms. These platforms power...  ...Responsibilities: Design and build scalable, high throughput, and low latency... 
    Full time
    Temporary work
    Local area
    Remote work
    Flexible hours

    Tubi

    San Francisco, CA
    1 day ago
  • Acceler8 Talent is looking for a Software Engineer in San Francisco to focus on building and optimizing inference systems for next-generation AI at scale. You will design production...  ...ideal candidate has hands-on experience in ML inference systems and strong skills in Python... 

    Acceler8 Talent

    San Francisco, CA
    2 days ago
  •  ...real-time. Our vision is AI systems that are flexible, personalized...  ...useful intelligence - the inference services that serve LLMs at scale...  ...about both. Researchers and ML engineers will hand you workloads that...  ...and cost across heterogeneous GPU fleets. Batching, scheduling,... 
    Flexible hours

    Adaption

    San Francisco, CA
    3 days ago
  •  ...Staff Technical Lead for Inference & ML Performance San Francisco fal is the generative media ecosystem...  ...shape the future of fal's inference engine and ensure our generative models achieve...  ...enhancing inference speed and scalability. Mentor and scale your team. Coach... 

    Fal

    San Francisco, CA
    14 hours ago
  • B Capital is seeking a talented engineer in San Francisco to build and scale distributed training systems for machine learning models. The ideal candidate will have strong...  ...in distributed training frameworks, debug GPU compute systems, and optimize training throughput... 

    B Capital

    San Francisco, CA
    1 day ago
  • $248.8k - $311k

     ...Physical AI and developing ML pipelines for...  ...Role As an ML Systems Engineer on the Physical AI team...  ...and build platforms for scalable, reliable, and...  ...performance tracking of model inference. Lead: Own projects...  ..., including GPU-level algorithm optimizations... 
    Full time

    Scale AI

    San Francisco, CA
    2 days ago
  • A leading tech company in San Francisco seeks a Machine Learning Engineer to build and maintain infrastructure for large-scale model training. In this hands-on role, you will design systems, work closely with researchers, and optimize training processes. Candidates should... 

    Monograph

    San Francisco, CA
    4 days ago
  • AI Chopping Block, Inc. is seeking a Machine Learning Engineer to design and build scalable machine learning systems. Responsibilities involve developing end-to-end ML pipelines, optimizing AI models for mobile environments, and integrating AI-driven solutions into applications... 

    AI Chopping Block, Inc.

    San Francisco, CA
    2 days ago
  •  ...AI workloads is seeking a Member of Technical Staff to design and optimize inference systems. The role involves managing KV cache...  ...Ideal candidates should have strong software engineering skills and experience with ML inference systems, particularly in Python and... 

    Gimlet Labs

    San Francisco, CA
    2 days ago
  • A tech-driven company focused on blockchain solutions is seeking a Senior ML Systems Engineer. In this role, you will build reusable workflows, automate model versioning, and deploy scalable AI systems. Candidates should have strong programming skills, experience with... 

    TRM Labs

    San Francisco, CA
    4 days ago
  • A tech-first company is seeking a Member of Technical Staff to focus on cutting-edge AI research and development. The role involves building and scaling training and inference infrastructure, designing ML kernels, and optimizing performance. Ideal candidates should have... 

    Mirendil

    San Francisco, CA
    4 days ago
  • $117.2k - $313.7k

     ...duplicating efforts. Job Category Software Engineering Job Details About Salesforce...  ...and drive innovations that improve system scalability, robustness, and availability....  ...Design patterns & Experience with Big-Data/ML and S3 Hands-on experience with Streaming... 
    Immediate start
    Remote work

    Salesforce

    San Francisco, CA
    2 days ago
  •  ...powers mission‑critical inference for the world's most...  ...help build the platform engineers turn to to ship AI products...  ...the global operating system for distributed,...  ...engineers to lead our GPU Networking efforts, making...  ...Exposure to a variety of ML startups, offering unparalleled... 
    Flexible hours

    Baseten

    San Francisco, CA
    4 days ago
  • AI Chopping Block, Inc. is looking for a Systems Engineer to develop and maintain infrastructure for large-scale ML model training in San Francisco. This role demands a focus on performance and reliability across various systems. Candidates should have strong debugging... 

    AI Chopping Block, Inc.

    San Francisco, CA
    4 days ago
  • Stars Arena is seeking a Systems Engineer to develop pioneering machine learning infrastructure that enhances the efficiency of experiments on local and cloud GPUs. This role requires strong skills in Python and a passion for understanding system internals. Alongside a... 
    Remote job
    Local area
    Flexible hours

    Stars Arena

    San Francisco, CA
    3 days ago
  • $320k

     ...interpretable, and steerable AI systems. We want AI to be safe...  ...committed researchers, engineers, policy experts, and...  ...Role The Cloud Inference team scales and...  ...serving; prior inference or ML experience is not required...  ..., we expect all staff to be in one of our offices... 
    Work at office
    Visa sponsorship
    Flexible hours

    Anthropic

    San Francisco, CA
    4 days ago
  • Jaide Health is seeking experienced Members of Technical Staff to join their Model Serving team. This role involves...  ...ideal candidate will have significant experience in engineering, especially with distributed systems. The company offers a hybrid work model and extensive... 

    Jaide Health

    San Francisco, CA
    2 days ago
  •  ...are looking for an experienced engineer to join our small, but growing...  ...to help build and maintain a scalable backend. This role is focused...  ...safe and transparent financial system. Fast moving, challenging and...  ...to eradicate it entirely. SF Office: Work in person in... 
    Work at office
    Flexible hours

    ABC Labs

    San Francisco, CA
    5 days ago
  • $181.1k - $318.4k

     ...Staff/Sr. Machine Learning Engineer, Foundation Models - AI, Search & Knowledge...  ...team to optimize inference for cutting edge model...  .... ~ Familiar with GPU programming concepts...  ...one of the popular ML Frameworks like...  ...building and maintaining systems written in modern... 
    Relocation

    Apple

    San Francisco, CA
    3 days ago
  • Genesis AI in San Francisco is looking for an experienced professional to optimize and build distributed training systems using PyTorch. The ideal candidate has over 8 years of experience in distributed systems, high-performance computing, and extensive expertise in Python... 

    Genesis AI

    San Francisco, CA
    4 days ago
  • Handshake is hiring a Software Engineer in San Francisco to develop the Reinforcement Learning Environments platform. This role entails building core components of RLE systems and improving system performance, requiring expertise in backend systems and cloud infrastructure... 
    Flexible hours

    Handshake

    San Francisco, CA
    3 days ago
  •  ...leading technology recruitment service seeks a Senior Software Engineer to join a fast-growing VC-backed B2B software platform in San Francisco. This role involves designing and building scalable backend systems that will impact the future of software transactions. Ideal... 

    Jack & Jill/External ATS

    San Francisco, CA
    2 days ago
  • A leading streaming service is seeking a Staff Software Engineer to enhance ML infrastructure. The role involves designing scalable systems, mentoring engineers, and collaborating with cross-functional teams. Candidates should have over 8 years of experience in building... 

    Tubi Tv

    San Francisco, CA
    2 days ago
