Senior Fellow: Distributed ML Engineer (Scale Training)

Advanced Micro Devices, Inc.

A leading technology company is seeking a Fellow/Sr. Fellow Machine Learning Engineer to join the Training At Scale team in San Jose, CA. The candidate will work on distributed training of large models and improve training efficiency. Responsibilities include enhancing pipeline performance, contributing to open source, and collaborating with various teams. A strong background in machine learning and experience with ML frameworks like PyTorch and TensorFlow are required. A Master’s or Ph.D. in a related field is preferred.

Vacancy posted 11 hours ago
Similar jobs that could be interesting for you
Based on the Senior Fellow: Distributed ML Engineer (Scale Training) vacancy in San Jose, CA
  • $153.2k - $234.1k

     ...transportation on a global scale. Role Overview:...  ...machine learning engineer working on our cutting...  ...driverless vehicles. As a Senior ML Infra Engineer, you...  ...machine learning model training and evaluation...  ...building large-scale distributed systems/applications or... 
    Senior
    Training
    Work at office
    Local area
    Remote work
    Work from home
    Relocation
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    5 days ago
  •  ...AI lab in Santa Clara is seeking a skilled software engineer with over 8 years of experience to optimize machine...  ...-time applications. The role involves designing distributed training strategies, collaborating with ML researchers, and developing tools for performance enhancement... 
    Senior
    Training

    Odyssey

    Santa Clara, CA
    2 days ago
  •  ...career. THE ROLE: We are looking for a Fellow/Sr. Fellow Machine Learning Engineer to join our Training At Scale team. If you are excited by the challenge of distributed training of large models on a...  ...optimization. Strong familiarity with ML frameworks (PyTorch, JAX,... 
    Senior
    Training

    Advanced Micro Devices, Inc.

    San Jose, CA
    9 hours ago
  • $224k - $356.5k

     ...Santa Clara is seeking exceptional Senior Machine Learning and Simulation Engineers for their Autonomous Vehicles (AV...  ...design and development of large-scale RL training frameworks to enhance multi-modal...  ...over 12 years of experience in ML training, simulating AV systems,... 
    Senior
    Training

    NVIDIA Gruppe

    Santa Clara, CA
    10 hours ago
  • $159.3k - $230.7k

     ...transportation on a global scale. The Data Scaling...  ...works on and delivers ML models to the product that...  ...foundation model pre-training and fine-tuning with...  ...high-impact team of AI/ML engineers, data scientists and...  ...autonomous vehicles. As a Senior AI/ML Engineer in the... 
    Senior
    Training
    Local area
    Remote work
    Work from home
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    5 days ago
  • $144.7k - $261.3k

     ...infrastructure, and ML/AI GPU platforms for...  ...GM is looking for a Senior Performance Engineer to join the AV Capacity...  ...provide input into large scale ML infrastructure...  ...reliability of large-scale ML training and inference...  ...issues within large-scale distributed production... 
    Senior
    Training
    Local area
    Remote work
    Work from home
    Flexible hours
    3 days per week

    General Motors

    Sunnyvale, CA
    1 day ago
  •  ...we advance your career. PMTS Large Scale Training Performance Optimization ENGINEER THE ROLE: We are looking for a...  ...you are excited by the challenge of distributed training of large models on a large...  ...PREFERRED EXPERIENCE: Experience with ML/DL frameworks such as PyTorch, JAX... 
    Training

    Advanced Micro Devices, Inc.

    San Jose, CA
    10 hours ago
  •  ...The Apple Ray team is seeking a Senior / Staff Software Engineer with strong distributed systems expertise and a solid background...  ...of Apple’s unified data+ML platform powered by open-source...  ...platform meets the needs of large-scale training and inference workloads. You... 
    Senior
    Training

    Apple

    Cupertino, CA
    2 days ago
  • $153.2k - $234.1k

     ...transportation on a global scale. Role: Are you passionate...  ...-world scenarios. As a Senior ML Infra Engineer, you will work on the core...  ...rapid dataset generation, training, evaluation and iteration...  ...experience working on large-scale distributed systems, applications, or... 
    Senior
    Training
    Local area
    Remote work
    Work from home
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    1 day ago
  • $150k

     ...A leading research lab in Sunnyvale is seeking a distributed ML infrastructure engineer to extend and scale training systems. The ideal candidate must have over 5 years of experience in ML systems with strong expertise in distributed training frameworks like DeepSpeed... 
    Training

    Institute of Foundation Models

    Sunnyvale, CA
    10 hours ago
  •  ...GPUs. Our novel wafer‑scale architecture provides the...  ...industry‑leading training and inference speeds and...  ...effortlessly run large‑scale ML applications, without...  ...looking for a Software Engineer to join the ML Integration...  ...infrastructure, distributed systems, and hardware/software... 
    Senior
    Training
    Work at office
    Remote work

    Dormont Manufacturing Company

    Sunnyvale, CA
    11 hours ago
  •  ...California, is offering a Staff ML Infra Engineer position that focuses on...  ...designing scalable systems for training and evaluating ML models,...  ...requiring a strong background in distributed systems and ML algorithms. A...  ...minimum of 5 years in large-scale systems is essential,... 
    Senior
    Training
    Remote work

    General Motors

    Sunnyvale, CA
    10 hours ago
  •  ...operating system. At 42dot, our AD ML Platform Engineers build the core data platform and ML training / eval platform for the cutting edge...  ...autonomous driving. We develop the distributed system of a scalable data platform for large‑scale dataset (millions of scenes), as... 
    Senior
    Training
    Full time
    Work experience placement

    42dot Inc.

    Sunnyvale, CA
    10 hours ago
  •  ...Senior Staff AI/ML System Software Engineer At d-Matrix, we are focused on unleashing the...  ...You are able to build and scale software deliverables in...  .... Experience with distributed, high performance software...  ...deployment, including training, quantization, sparsity,... 
    Senior
    Training
    Work experience placement
    3 days per week

    D-Matrix

    Santa Clara, CA
    3 days ago
  • $155.42k - $395.9k

     ...to-end AI lifecycle of ML pipelines-from local...  ...experimentation and large-scale training to evaluation, lineage...  ..., enabling ML engineers and researchers to develop...  ...The Role: As a Senior AI/ML Engineer, you will...  ...implement, and test scalable distributed computing and data... 
    Senior
    Training
    Local area
    Remote work
    Work from home
    Relocation
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    2 days ago
  • $152k - $241.5k

     ...vehicles. We are now looking for a ML Platform Engineer to help accelerate the next...  ...will architect, build, and scale our high-performance ML...  ...scientists and engineers to train and deploy the most advanced...  ...scalability across large-scale, distributed GPU clusters. Apply SRE... 
    Senior
    Training

    NVIDIA

    Santa Clara, CA
    1 day ago
  • $170.6k - $261.3k

     ...world! The Data Labeling Engineering team designs, builds, and operates...  ..., data engineering, and AI/ML, defining the strategies,...  ...that create reliable training data at scale. Our tools and platform are...  ...experience building robust distributed platforms and applications.... 
    Senior
    Training
    Local area
    Remote work
    Work from home
    Flexible hours

    General Motors

    Sunnyvale, CA
    29 days ago
  •  ...technology company in California is seeking a Senior Technical Program Manager to oversee and execute large-scale AI training programs. The ideal candidate will manage...  ...strong background in AI training frameworks and distributed systems, coupled with excellent... 
    Senior
    Training

    Advanced Micro Devices, Inc.

    San Jose, CA
    11 hours ago
  •  ...A leading technology firm is seeking a Principal Machine Learning Engineer in San Jose, CA. The role focuses on optimizing distributed training for large models, making significant contributions to AMD's AI platform. The ideal candidate should have expertise in distributed... 
    Training

    Advanced Micro Devices, Inc.

    San Jose, CA
    1 day ago
  • $181.1k - $318.4k

     ...Apple Inc. is looking for a Senior Machine Learning Engineer for the Siri Speech team in Cupertino, California...  ...data processing to improve model training. Candidates should have strong experience in processing complex data, distributed frameworks, and software engineering... 
    Senior
    Training

    Apple

    Cupertino, CA
    1 day ago
  • $184k - $287.5k

     ...analyzers that are candidates for ML replacement and build the...  ...in Computer Science, Computer Engineering, or a related technical field....  ...pipelines Comfort with large‑scale data processing (Spark, Dask,...  ...JAX. Comfortable with GPU‑based training workflows. Ways to stand out from... 
    Senior
    Training

    NVIDIA Gruppe

    Santa Clara, CA
    10 hours ago
  •  ...forward-thinking AI infrastructure company is seeking a Staff AI Runtime Engineer to lead the design and optimization of their AI compute platform. In this leadership role, you'll enhance AI training and inference capabilities. Successful candidates will have over 8... 
    Senior
    Training

    FlexAI

    Santa Clara, CA
    11 hours ago
  • $181.1k - $318.4k

     ...Sr. ML Engineer, Siri User Experience Metrics and Data Cupertino...  ...Services We’re looking for a Senior Machine Learning Engineer...  ..., building and training ML models using distributed processing frameworks such...  ...infrastructure, and large‑scale operations, including model... 
    Senior
    Training
    Relocation

    Apple

    Cupertino, CA
    11 hours ago
  •  ...seeking a highly skilled Machine Learning Engineer with deep expertise in developing Bird’...  ...perception models using large-scale datasets and well-defined quantitative...  ...production environments. Experience with distributed training, high-performance computing, or GPU acceleration... 
    Senior
    Training

    PlusAI

    Santa Clara, CA
    26 days ago
  •  ...Staff/Sr. ML Compute Efficiency Engineer Scaling machine learning workloads across thousands of GPUs and TPUs creates...  ...that powers large-scale ML training and inference workloads, bringing together expertise in distributed systems, machine learning infrastructure... 
    Senior
    Training

    Apple

    Santa Clara, CA
    2 days ago
  • $181.1k - $318.4k

     ...Senior Machine Learning Engineer, AI, SIML Work Locations (2) Submit...  ...Efficient and Scalable ML Infrastructure, and...  ...performant, scalable training and inference for...  ...generation tools for large-scale deep learning. You'...  ...such as PyTorch Distributed (torch.distributed),... 
    Senior
    Training
    Relocation
    Flexible hours

    Apple

    Cupertino, CA
    3 days ago
  • $181.1k - $318.4k

     ...Senior Computer Vision and Machine Learning Engineer, Creator Studio Work Locations (2) Submit Resume At Apple, new ideas...  ...machine learning models, from large distributed training and validation to efficient inference at scale Design data pipelines in partnership... 
    Senior
    Training
    Relocation

    Apple

    Cupertino, CA
    1 day ago
  • $184.5k

     ...Senior ML/Gen AI Engineer Strategic Partnerships & Affiliates is part of the Expedia...  .... We work with a geo‑distributed, cross‑functional team of 5...  ...scientists to productize and scale ML models, from experimentation...  ...learning systems for training, deployment, inference, and... 
    Senior
    Training

    Expedia, Inc.

    San Jose, CA
    10 hours ago
  • $184.5k

     ...open world. Join us. Senior ML/Gen AI Engineer Introduction to the...  .... You will work with a geo-distributed, cross-functional team of 5...  ...Scientists to productize and scale ML models, from experimentation...  ...learning systems for training, deployment, inference, and... 
    Senior
    Training
    Local area
    Flexible hours

    Expedia Group

    San Jose, CA
    1 day ago
  • $150k

     ...cutting‑edge foundation model training, alongside world‑class...  ...researchers, data scientists, and engineers, tackling the most...  ...pioneers. The Role The Distributed ML Engineer will play a role at...  ...debug methodologies, and large‑scale machine learning experience.... 
    Training
    Work experience placement
    Visa sponsorship

    Institute of Foundation Models

    Sunnyvale, CA
    10 hours ago
