
ML Systems Engineer: Distributed LLM Training & Inference

$200.8k - $251k

Scale AI

A leading AI technology company in San Francisco seeks a team member to build and optimize a machine learning framework for large language models. Candidates should have system optimization experience and solid software engineering skills, particularly with tools like CUDA and PyTorch. This full-time position offers a competitive salary range of $200,800 - $251,000, along with comprehensive benefits.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the ML Systems Engineer: Distributed LLM Training & Inference in Seattle, WA vacancy
  •  ...reliable, field-ready AI systems that solve the...  ...rigorous engineering with learning systems...  ...are seeking a Staff ML Systems Engineer to...  ...architect and build the distributed infrastructure...  ...processing, model training, evaluation, and...  ...learning training and inference systems.... 
    Training
    Local area

    FieldAI

    Seattle, WA
    15 days ago
  •  ...technology company in Seattle is seeking a Machine Learning Engineer for Model Serving Infrastructure. The ideal candidate will...  ...programming skills in C/C++/CUDA. You will design and implement distributed inference infrastructure and collaborate with product teams. This... 
    Suggested

    ByteDance

    Seattle, WA
    4 days ago
  • $204k - $259k

     ...Machine Learning Engineer – VLM/LLM Evaluation Waymo...  ..., Bayesian inference, hierarchical learning...  ...Waymo's systems, both onboard autonomous...  ...life-cycle from pre-training and supervised...  ...Experience in ML engineering and applied...  ...with large scale distributed system... 
    Training
    Full time
    Temporary work
    Remote work

    Waymo

    Kirkland, WA
    3 days ago
  • $189.6k - $237k

     ...Scale's ML platform (RLXF) team builds our internal distributed framework for large language model training and inference. The platform has been...  ...and evaluation of LLM's, as well as evaluation...  ...optimize our ML system Ideally you...  ...Strong software engineering skills,... 
    Training
    Full time

    Scale AI

    Seattle, WA
    2 days ago
  •  ...seeking a Senior or Staff Software Engineer for the ML Infrastructure team. The role...  ...designing and operating systems for large-scale model training and inference, focusing on reliability and performance...  ...extensive experience with distributed systems, Kubernetes, and... 
    Training

    Salesforce, Inc.

    Seattle, WA
    1 day ago
  •  ...Performance Engineer, Inference Systems San Francisco, CA | New York City,...  ...kernels, model servers, distributed routing, autoscaling, capacity...  ...Experience with ML systems, especially training or inference infrastructure or general LLM serving stacks. Direct large... 
    Training
    Work at office
    Visa sponsorship
    Flexible hours

    Anthropic

    Seattle, WA
    3 days ago
  •  ...A leading software company in Seattle is seeking a Senior Machine Learning Systems & Efficiency Engineer to enhance inference performance and cost efficiency in image editing applications. The role requires expertise in AI, machine learning systems, and performance optimization... 

    Adobe

    Seattle, WA
    4 days ago
  • $164k - $313.3k

     ...Senior Machine Learning (ML) Systems & Efficiency Engineer to join our R&D team focused...  ...-ready improvements in inference performance, latency, and...  ...with experience in distributed inference, multimodal model...  ...communication efficiency. Explore training or fine-tuning approaches... 
    Training
    Temporary work
    Local area
    Worldwide

    Adobe

    Seattle, WA
    1 day ago
  •  ...interpretable, and steerable AI systems. We want AI to be...  ...researchers, engineers, policy experts, and...  ...-edge systems that train AI models like...  ...steerable AI. As an ML Systems Engineer on...  ...performance, large scale distributed systems Large scale LLM training Python Implementing... 
    Training
    Work at office
    Visa sponsorship
    Flexible hours

    Menlo Ventures

    Seattle, WA
    1 day ago
  • $320k - $405k

     ...Machine Learning Systems Engineer, Research Tools About Anthropic...  ...more efficient and effective training of our AI systems while ensuring...  ...systems, data pipelines, or ML infrastructure Are proficient...  ...scientific progress Distributed systems and parallel computing... 
    Training
    Full time
    Work at office
    Visa sponsorship
    Flexible hours

    Anthropic

    Seattle, WA
    4 days ago
  •  ...Annapurna Labs (U.S.) Inc. in Seattle is seeking a Senior Software Engineer to join the Machine Learning Inference Applications team. This role involves adapting cutting-edge research in LLM optimization to enhance performance on Neuron devices. The position requires extensive... 

    Annapurna Labs

    Seattle, WA
    4 days ago
  • $233.4k - $339.65k

     ...highly skilled and experienced Principal ML Systems Engineer to join our Autonomous Vehicles team....  ...Design & develop the next generation distributed ML data platform (Ingestion,...  ...model lifecycle (feature engineering, training, validation, deployment, monitoring, etc... 
    Training
    H1b
    Local area
    Work from home
    Relocation package
    Flexible hours

    General Motors

    Bellevue, WA
    6 days ago
  • $204.8k - $296.6k

     ...a Senior Machine Learning Systems & Efficiency Engineer in Seattle. This critical...  ...will focus on optimizing ML systems for efficiency and...  ...relevant field and experience in distributed systems and performance...  ...designing high-throughput inference systems, conducting... 

    Adobe

    Seattle, WA
    4 days ago
  •  ...Principal AI Agent / ML Software Engineer The Principal...  ...-generation AI systems on Oracle Cloud...  ...workflows, scalable inference infrastructure,...  ...candidate combines deep distributed systems...  ...understanding of LLM application patterns...  ...GPU inference or training workloads for latency... 
    Training

    Oracle

    Seattle, WA
    4 days ago
  • $170k - $240k

     ...the Product and Engineering team at...  ...MLE) on the AI & ML (Insights) team...  ...architecture, training, deployment, and...  ...scalable data systems. You will be expected...  ...models that can infer meaning and...  ...grade GenAI or LLM‑based systems with...  ...pipelines and distributed systems using technologies... 
    Training
    Work at office
    Remote work
    Visa sponsorship

    PitchBook

    Seattle, WA
    4 days ago
  •  ...Machine Learning Engineer (Senior) About AZX...  ...specialize in physics-informed ML and enterprise AI...  ...climate risk, energy systems, and global economics....  ...~ Generative AI and LLM-related capabilities (...  ...Large-scale data and distributed training paradigms (e.g., Spark... 
    Training
    Full time
    Remote work
    Work visa
    Flexible hours
    Shift work

    AZX

    Seattle, WA
    3 days ago
  • $148.2k - $300.96k

     ...Advanced machine learning systems to detect and prevent...  ...- Design prompt engineering and reasoning workflows...  ...indicators, and real-time LLM-based decisions. - Knowledge...  ...part of a cutting-edge ML + LLM team shaping the...  ...with LLM post-training applications , especially... 
    Training
    Temporary work
    Local area
    Worldwide

    TikTok

    Seattle, WA
    1 day ago
  • $200k - $250k

     ...Machine Learning Engineering within the Advanced...  ...our foundational systems that power our...  ...annotation pipelines, ML Infrastructure...  ...workflows (LLM-in-the-loop) to reduce...  ...lifecycle, including distributed training infrastructure,...  ...and low-latency inference services. Ensure... 
    Training
    Temporary work
    Work at office
    Local area

    Metropolis

    Seattle, WA
    16 days ago
  • $171.6k - $302.2k

     ...Machine Learning Engineer, AI, SIML Seattle...  ...The Intelligence System Experience (ISE)...  ...Efficient and Scalable ML Infrastructure,...  ..., scalable training and inference for Apple's AI-driven...  ...understanding of LLM architectures and...  ...such as PyTorch Distributed (torch.distributed... 
    Training
    Relocation
    Flexible hours

    Apple Inc.

    Seattle, WA
    4 days ago
  •  ...Capital is seeking Software Engineers to join the ML Infrastructure team. In...  ...you will design and operate systems to support large scale machine learning model training and inference. Candidates need...  ...experience in backend systems and distributed technologies like Kubernetes... 
    Training

    B Capital

    Seattle, WA
    3 days ago
  • $171.6k - $302.2k

    Senior ML Infrastructure Engineer - Training Algorithms, SIML Seattle, Washington, United...  ...? We are the Intelligence System Experience (ISE) team within...  ...in training / adapting LLM and Diffusion models Advanced...  ...projects Experience with distributed training of large models... 
    Training
    Relocation

    Apple Inc.

    Seattle, WA
    4 days ago
  •  ...seeking a Senior or Staff Software Engineer to join their ML Infrastructure team. You will...  ...and operating core systems for large scale model training and inference in a fast-paced environment....  ...This role requires expertise in distributed systems and Kubernetes, with... 
    Training

    Slack Enterprise

    Seattle, WA
    3 days ago
  •  ...Systems Engineer About Us We are building next-generation...  ...-built for large-scale AI training and inference. As a startup, we operate...  ...system performance across distributed environments Troubleshoot...  ...Experience in AI/ML infrastructure or HPC environments... 
    Training
    Work at office

    Nscale

    Seattle, WA
    5 days ago
  • $171.6k - $302.2k

     ...Description As a Senior/Staff Engineer on the Foundation...  ...and orchestration systems for large-scale TPU...  ...clusters. You will work on distributed systems that manage...  ...of large-scale training and inference jobs. This role spans...  ...systems for distributed ML workloads running on... 
    Training
    Relocation

    Apple Inc.

    Seattle, WA
    3 days ago
  •  ...VAST Data is looking for a Senior Systems Engineer to join our growing team! This is...  ...for real-time data analysis and AI training and inference. Designed from the ground up to make...  ...matter expertise on storage products, distributed storage architectures, file systems,... 
    Training
    Traineeship

    VAST Data

    Seattle, WA
    1 day ago
  • $176.76k - $232k

     ...company for yoga, running, training, and other athletic...  ...As a Senior AI/ML Engineer, you will lead the delivery...  ...tuning to architectures and system design for serving AI/ML inference solutions in production....  ...architecture and engineering of LLM and GenAI systems including... 
    Training
    Permanent employment
    Contract work
    Part time
    Work visa

    lululemon

    Seattle, WA
    5 days ago
  •  ...Anomaly Detection, and LLM fine-tuning...  ...As one of our AI/ML Engineers, you'll be a...  ...performance multi-agent systems that perceive,...  ...Build real-time inference pipelines for...  ...architecting large-scale distributed systems on cloud...  ..., Paternity) Training & Development... 
    Training
    Shift work

    C-Serv

    Bellevue, WA
    9 days ago
  •  ...Machine Learning Engineer As a Machine Learning...  ...of intelligent systems. You will bridge...  ...high-performance distributed systems to support...  ...large-scale model inference and data processing...  ...Implement robust ML pipelines, focusing...  ...data ingestion and training to production... 
    Training

    DocuSign

    Seattle, WA
    3 days ago
  • $184.5k

     ...Senior Machine Learning Engineer to join our high-performing...  ...batch and real-time ML systems that power pricing, inventory...  ...of machine learning, distributed systems, and MLOps,...  ...feature pipelines, model training and validation, scalable inference, monitoring, drift detection... 
    Training
    Local area
    Flexible hours

    Expedia Group

    Seattle, WA
    4 days ago
  •  ...requires production-grade AI/ML systems that meet federal data...  ...preprocessing, feature engineering, model selection, training, evaluation, and validation...  ...Large Language Model (LLM) capabilities, including...  ...backend services for model inference, batch processing, and real... 
    Training
    For contractors
    1 day per week

    Innosoft Corporation

    Seattle, WA
    4 days ago
