Senior ML Systems Engineer: Scalable Training Frameworks

Cohere

A leading AI research firm located in San Francisco is seeking a Senior ML Systems Engineer to build and maintain the training framework for large-scale language models. The role involves designing distributed training solutions and improving training throughput across multi-node clusters. The ideal candidate will have strong engineering experience in distributed training, familiarity with JAX, and excellent collaboration skills. This position promises significant ownership over critical components and engagement with cutting-edge AI technologies while offering a flexible work environment.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the Senior ML Systems Engineer: Scalable Training Frameworks in San Francisco, CA vacancy
  •  ...Senior ML Systems Engineer, Frameworks & Tooling at Cohere Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises...  ...that enable fast, reliable, and scalable model training and build the tooling... 
    Senior
    Training
    Full time
    Work at office
    Remote work
    Flexible hours

    Cohere

    San Francisco, CA
    2 days ago
  •  ...Francisco is looking for a Senior Software Engineer to build scalable infrastructure for large‑scale training and fine-tuning of...  ...training systems and optimize GPU utilization...  ...of experience in ML infrastructure and a...  ...distributed training frameworks. Competitive compensation... 
    Senior
    Training

    BaseTen

    San Francisco, CA
    2 days ago
  •  ...secure. The AI Engineering Team is chartered...  ...LLMs) and agentic systems. Our mission is...  ..., evaluation frameworks, and orchestration...  ...About the Role As a Senior or Staff ML Systems Engineer...  ...for model training, evaluation, and...  ...out a modular and scalable AI infrastructure... 
    Senior
    Training
    Remote work
    Worldwide

    TRM Labs

    San Francisco, CA
    3 days ago
  • $295k - $380k

    OpenAI is searching for a Senior Software Engineer to join their Robotics team in San Francisco. The role focuses on maintaining and improving the training framework while actively reviewing and debugging code within ML systems. The ideal candidate should thrive in hands... 
    Senior
    Training

    OpenAI

    San Francisco, CA
    13 hours ago
  •  ...A growing technology company in San Francisco is seeking a Senior Machine Learning Engineer to design and implement advanced ML systems. Responsibilities include optimizing data pipelines and collaborating with product teams. The ideal candidate has strong experience... 
    Senior
    Flexible hours

    EvenUp Inc.

    San Francisco, CA
    2 days ago
  • MakerMaker.AI is seeking a Senior ML Engineer in San Francisco. In this role, you will build and maintain machine learning systems and pipelines for research purposes, ensuring accurate...  ...and owning the data pipelines for training and evaluation. If you have 6+ years of... 
    Senior
    Training

    MakerMaker.AI

    San Francisco, CA
    13 hours ago
  • $200k - $400k

     ...data platform is seeking a Senior Machine Learning Engineer to build and optimize large-...  ...pipelines for AI video model training. This role involves architecting data ingestion systems, developing multimodal models...  ...has over 6 years of ML engineering experience, expertise... 
    Senior
    Training

    Troveo AI

    San Francisco, CA
    2 days ago
  •  ...Senior ML/RL Engineer, Behavior Planning At Bot Auto, we are revolutionizing...  ...world by developing a scalable policy framework that represents both our L...  ...MARL) and safety-critical system design to ensure our...  ...Behavioral Modeling: Develop and train diverse, conditioned... 
    Senior
    Training
    Shift work

    Bot Auto

    San Francisco, CA
    3 days ago
  •  ...and maintain large distributed ML training and inference clusters Develop efficient, scalable end-to-end pipelines to manage...  ...proficiency with distributed training frameworks (e.g. FSDP, DeepSpeed) to train...  ...on distributed task management systems and scalable model serving &... 
    Senior
    Training

    Kindredventures

    San Francisco, CA
    3 hours ago
  •  ...Python and standard ML frameworks (e.g., JAX,...  ...scale distributed training and data processing...  ...evaluating complex AI systems, (Desirable)...  ..., influencing senior stakeholders, and...  ...build and operate scalable machine learning...  ...researchers and software engineers who are... 
    Senior
    Training

    Waymo

    San Francisco, CA
    1 day ago
  • $148.5k - $223.9k

     ...Category Software Engineering Job Details About...  ...DET is seeking a Senior Tax Systems Engineer (Vertex) to...  ...business requirements into scalable technical solutions...  ...middleware, or custom frameworks ~ Solid...  ...promotion, benefits, training, assessment of job performance... 
    Senior
    Training

    Salesforce

    San Francisco, CA
    13 hours ago
  • $118k - $169k

     ...protected the U.S. financial system for over thirty years...  ...in real time. The Sr. ML Ops Engineer will partner with our...  ...builds, and maintains scalable ML infrastructure and pipelines for model training, deployment, and...  ...Science and ML packages and frameworks. Experience with... 
    Senior
    Training
    Hourly pay
    Work experience placement
    Work at office
    Immediate start
    Visa sponsorship
    Work visa
    Flexible hours

    Early Warning Services, LLC

    San Francisco, CA
    4 days ago
  • $172k

     ...Senior AI/ML Engineer Chicago, IL, USA; New York, NY, USA; San Francisco...  ...teams to deploy scalable AI systems that improve member engagement...  ...improve infrastructure for training, serving, and monitoring large...  ...Contribute to experimentation frameworks, optimization strategies,... 
    Senior
    Training
    Full time
    Work at office
    Local area
    Remote work
    Night shift

    CHIME INC.

    San Francisco, CA
    2 days ago
  • $272k - $336k

     ...Senior Staff Regulatory and Compliance Systems Engineer Waymo is an autonomous driving technology...  ...called upon in setting (scalable) strategy for (technical...  ...lower-level fault response frameworks Deeply understand...  ..., experience, relevant training and education, and... 
    Senior
    Training
    Odd job
    Full time
    Remote work

    Waymo

    San Francisco, CA
    1 day ago
  • $275k - $325k

     ...interpretable, and steerable AI systems. We want AI to be safe...  ...researchers, engineers, policy experts, and...  ...platforms secure and scalable as the company doubles...  ...Security and compliance frameworks Deadline to apply: None...  ...of education, training, and/or experience Required... 
    Senior
    Training
    Work at office
    Visa sponsorship
    Flexible hours

    Nerdleveltech

    San Francisco, CA
    1 day ago
  •  ...seeking an experienced Software Engineer to develop machine learning...  ...infrastructure for monetization and ads systems. The role involves building data pipelines, creating training platforms, and collaborating...  ...in distributed systems and ML workflows. Join us in shaping the... 
    Senior
    Training

    AI Chopping Block, Inc.

    San Francisco, CA
    13 hours ago
  •  ...An innovative company is seeking a Distributed Systems/ML Engineer to enhance the training throughput of its internal framework. This role involves collaborating with researchers to develop efficient video models and applying cutting-edge techniques to optimize training... 
    Senior
    Training

    OpenAI

    San Francisco, CA
    2 days ago
  • A tech-driven company focused on blockchain solutions is seeking a Senior ML Systems Engineer. In this role, you will build reusable workflows, automate model versioning, and deploy scalable AI systems. Candidates should have strong programming skills, experience with scalable... 
    Senior

    TRM Labs

    San Francisco, CA
    3 days ago
  • TRM Labs is looking for a Senior or Staff ML Systems Engineer to focus on building and scaling the technical infrastructure for AI/ML systems in San...  ...strong Python programming skills, a solid background in scalable infrastructure, and experience deploying LLM workflows.... 
    Senior

    TRM Labs

    San Francisco, CA
    3 days ago
  • $172.5k - $260.1k

     ...Job Category Software Engineering About Salesforce...  ...technical depth to build systems from the ground up...  ...leverage AI to ensure scalability. We partner with the...  .... Mastery of agentic frameworks such as LangGraph. Leverage...  ...promotion, benefits, training, assessment of job... 
    Senior
    Training
    Shift work

    Centaur Labs

    San Francisco, CA
    4 days ago
  • $200.8k - $251k

     ...San Francisco seeks a team member to build and optimize a machine learning framework for large language models. Candidates should have system optimization experience and solid software engineering skills, particularly in tools like CUDA and PyTorch. This full-time... 
    Training
    Full time

    Scale AI

    San Francisco, CA
    2 days ago
  • $180.6k - $315k

     ...A leading AI data foundry is seeking a Machine Learning Systems Research Engineer to optimize and develop their training frameworks for next-generation AI models. Ideal candidates will have 1-3 years of LLM training experience, strong software engineering skills, and advanced... 
    Training
    Full time

    Scale AI

    San Francisco, CA
    2 days ago
  •  ...integration into real-world systems, with the observability,...  ...are looking for a visionary Senior ML Engineer who will bridge the gap between...  ...years of ML, specifically training or fine-tuning LLM models,...  ...; utilizing evaluation frameworks to quantify performance ~... 
    Senior
    Training
    Shift work

    Palm Venture Studios

    San Francisco, CA
    4 days ago
  •  ...infrastructure. This role requires expertise in deploying GPU systems for high-throughput inference and model performance optimization...  ...candidate will have hands-on experience with modern inference frameworks and a solid understanding of reinforcement learning... 
    Senior
    Training

    Reflection AI

    San Francisco, CA
    3 days ago
  • $204k - $259k

     ...Perception team builds the system which learns the...  ...sensors, enabling engineers like you to (1)...  ...models and model training at scale, to (3)...  ...of our work is ML-related. Recently...  ..., etc. Develop scalable recipes for large...  ...Experience with ML frameworks like PyTorch, JAX,... 
    Senior
    Training
    Full time
    Remote work

    Waymo

    San Francisco, CA
    1 day ago
  •  ...Highlight AI We're a small, senior team building the...  ...We're hiring a Senior ML Engineer to help build the AI systems that power Highlight. You...  ...stack: data pipelines, model training, retrieval, ranking, evals...  ...engineering, fine tuning, eval frameworks) ~ Product thinker:... 
    Senior
    Training
    Work at office
    Relocation
    Relocation package
    Flexible hours

    Highlight AI

    San Francisco, CA
    2 days ago
  • $160k - $250k

     ...Senior Machine Learning Engineer In order to execute our vision...  ...of planning out scalable, maintainable...  ...involved in applying a ML model to a...  ...refining data, training and tuning the model...  ...core backend systems by suggesting and...  ...machine learning frameworks, such as PyTorch... 
    Senior
    Training

    Hive

    San Francisco, CA
    13 hours ago
  • $200k - $400k

     ...data platform to train AI video models...  ...labs, enabling scalable, compliant, and...  ...strategic engineer to help us scale...  ...Overview The Senior Machine Learning...  ...across the full ML lifecycle, from...  ...‑in‑the‑loop systems to curate high‑...  ...Build evaluation frameworks with metrics... 
    Senior
    Training
    Work experience placement

    Troveo AI

    San Francisco, CA
    2 days ago
  • $204k - $259k

     ...serving as the foundation for training and validating the AV stack. We are an advanced ML and engineering team that leverages state-...  ...Design and implement a scalable AI agent framework that integrates large...  ...continuously improving the system's captioning and reasoning... 
    Senior
    Training
    Full time
    Remote work

    Waymo

    San Francisco, CA
    3 days ago
  • A leading tech company in San Francisco seeks a Machine Learning Engineer to build and maintain infrastructure for large-scale model training. In this hands-on role, you will design systems, work closely with researchers, and optimize training processes. Candidates should... 
    Training

    Monograph

    San Francisco, CA
    3 days ago
