Senior Fellow: Distributed ML Engineer (Scale Training)
Advanced Micro Devices, Inc.
A leading technology company is seeking a Fellow/Sr. Fellow Machine Learning Engineer to join the Training At Scale team in San Jose, CA. The candidate will work on distributed training of large models and improve training efficiency. Responsibilities include enhancing pipeline performance, contributing to open source, and collaborating with various teams. A strong background in machine learning and experience with ML frameworks like PyTorch and TensorFlow are required. A Master’s or Ph.D. in a related field is preferred.
$153.2k - $234.1k
...transportation on a global scale. Role Overview:... ...machine learning engineer working on our cutting... ...driverless vehicles. As a Senior ML Infra Engineer, you... ...machine learning model training and evaluation... ...building large-scale distributed systems/applications or...
Senior, Training, Work at office, Local area, Remote work, Work from home, Relocation, Relocation package, Flexible hours
- ...AI lab in Santa Clara is seeking a skilled software engineer with over 8 years of experience to optimize machine... ...-time applications. The role involves designing distributed training strategies, collaborating with ML researchers, and developing tools for performance enhancement...
Senior, Training
- ...career. THE ROLE: We are looking for a Fellow/Sr. Fellow Machine Learning Engineer to join our Training At Scale team. If you are excited by the challenge of distributed training of large models on a... ...optimization. Strong familiarity with ML frameworks (PyTorch, JAX,...
Senior, Training
$224k - $356.5k
...Santa Clara is seeking exceptional Senior Machine Learning and Simulation Engineers for their Autonomous Vehicles (AV... ...design and development of large-scale RL training frameworks to enhance multi-modal... ...over 12 years of experience in ML training, simulating AV systems,...
Senior, Training
$159.3k - $230.7k
...transportation on a global scale. The Data Scaling... ...works on and delivers ML models to the product that... ...foundation model pre-training and fine-tuning with... ...high-impact team of AI/ML engineers, data scientists and... ...autonomous vehicles. As a Senior AI/ML Engineer in the...
Senior, Training, Local area, Remote work, Work from home, Relocation package, Flexible hours
$144.7k - $261.3k
...infrastructure, and ML/AI GPU platforms for... ...GM is looking for a Senior Performance Engineer to join the AV Capacity... ...provide input into large scale ML infrastructure... ...reliability of large-scale ML training and inference... ...issues within large-scale distributed production...
Senior, Training, Local area, Remote work, Work from home, Flexible hours, 3 days per week
- ...we advance your career. PMTS Large Scale Training Performance Optimization ENGINEER THE ROLE: We are looking for a... ...you are excited by the challenge of distributed training of large models on a large... ...PREFERRED EXPERIENCE: Experience with ML/DL frameworks such as PyTorch, JAX...
Training
- ...The Apple Ray team is seeking a Senior / Staff Software Engineer with strong distributed systems expertise and a solid background... ...of Apple’s unified data+ML platform powered by open-source... ...platform meets the needs of large-scale training and inference workloads. You...
Senior, Training
$153.2k - $234.1k
...transportation on a global scale. Role: Are you passionate... ...-world scenarios. As a Senior ML Infra Engineer, you will work on the core... ...rapid dataset generation, training, evaluation and iteration... ...experience working on large-scale distributed systems, applications, or...
Senior, Training, Local area, Remote work, Work from home, Relocation package, Flexible hours
$150k
...A leading research lab in Sunnyvale is seeking a distributed ML infrastructure engineer to extend and scale training systems. The ideal candidate must have over 5 years of experience in ML systems with strong expertise in distributed training frameworks like DeepSpeed...
Training
- ...GPUs. Our novel wafer‑scale architecture provides the... ...industry‑leading training and inference speeds and... ...effortlessly run large‑scale ML applications, without... ...looking for a Software Engineer to join the ML Integration... ...infrastructure, distributed systems, and hardware/software...
Senior, Training, Work at office, Remote work
- ...California, is offering a Staff ML Infra Engineer position that focuses on... ...designing scalable systems for training and evaluating ML models,... ...requiring a strong background in distributed systems and ML algorithms. A... ...minimum of 5 years in large-scale systems is essential,...
Senior, Training, Remote work
- ...operating system. At 42dot, our AD ML Platform Engineers build the core data platform and ML training / eval platform for the cutting edge... ...autonomous driving. We develop the distributed system of a scalable data platform for large‑scale dataset (millions of scenes), as...
Senior, Training, Full time, Work experience placement
- ...Senior Staff AI/ML System Software Engineer At d-Matrix, we are focused on unleashing the... ...You are able to build and scale software deliverables in... .... Experience with distributed, high performance software... ...deployment, including training, quantization, sparsity,...
Senior, Training, Work experience placement, 3 days per week
$155.42k - $395.9k
...to-end AI lifecycle of ML pipelines-from local... ...experimentation and large-scale training to evaluation, lineage... ..., enabling ML engineers and researchers to develop... ...The Role: As a Senior AI/ML Engineer, you will... ...implement, and test scalable distributed computing and data...
Senior, Training, Local area, Remote work, Work from home, Relocation, Relocation package, Flexible hours
$152k - $241.5k
...vehicles. We are now looking for a ML Platform Engineer to help accelerate the next... ...will architect, build, and scale our high-performance ML... ...scientists and engineers to train and deploy the most advanced... ...scalability across large-scale, distributed GPU clusters. Apply SRE...
Senior, Training
$170.6k - $261.3k
...world! The Data Labeling Engineering team designs, builds, and operates... ..., data engineering, and AI/ML, defining the strategies,... ...that create reliable training data at scale. Our tools and platform are... ...experience building robust distributed platforms and applications....
Senior, Training, Local area, Remote work, Work from home, Flexible hours
- ...technology company in California is seeking a Senior Technical Program Manager to oversee and execute large-scale AI training programs. The ideal candidate will manage... ...strong background in AI training frameworks and distributed systems, coupled with excellent...
Senior, Training
- ...A leading technology firm is seeking a Principal Machine Learning Engineer in San Jose, CA. The role focuses on optimizing distributed training for large models, making significant contributions to AMD's AI platform. The ideal candidate should have expertise in distributed...
Training
$181.1k - $318.4k
...Apple Inc. is looking for a Senior Machine Learning Engineer for the Siri Speech team in Cupertino, California... ...data processing to improve model training. Candidates should have strong experience in processing complex data, distributed frameworks, and software engineering...
Senior, Training
$184k - $287.5k
...analyzers that are candidates for ML replacement and build the... ...in Computer Science, Computer Engineering, or a related technical field.... ...pipelines Comfort with large‑scale data processing (Spark, Dask,... ...JAX. Comfortable with GPU‑based training workflows. Ways to stand out from...
Senior, Training
- ...forward-thinking AI infrastructure company is seeking a Staff AI Runtime Engineer to lead the design and optimization of their AI compute platform. In this leadership role, you'll enhance AI training and inference capabilities. Successful candidates will have over 8...
Senior, Training
$181.1k - $318.4k
...Sr. ML Engineer, Siri User Experience Metrics and Data Cupertino... ...Services We’re looking for a Senior Machine Learning Engineer... ..., building and training ML models using distributed processing frameworks such... ...infrastructure, and large‑scale operations, including model...
Senior, Training, Relocation
- ...seeking a highly skilled Machine Learning Engineer with deep expertise in developing Bird’... ...perception models using large-scale datasets and well-defined quantitative... ...production environments. Experience with distributed training, high-performance computing, or GPU acceleration...
Senior, Training
- ...Staff/Sr. ML Compute Efficiency Engineer Scaling machine learning workloads across thousands of GPUs and TPUs creates... ...that powers large-scale ML training and inference workloads, bringing together expertise in distributed systems, machine learning infrastructure...
Senior, Training
$181.1k - $318.4k
...Senior Machine Learning Engineer, AI, SIML Work Locations (2) Submit... ...Efficient and Scalable ML Infrastructure, and... ...performant, scalable training and inference for... ...generation tools for large-scale deep learning. You'... ...such as PyTorch Distributed (torch.distributed),...
Senior, Training, Relocation, Flexible hours
$181.1k - $318.4k
...Senior Computer Vision and Machine Learning Engineer, Creator Studio Work Locations (2) Submit Resume At Apple, new ideas... ...machine learning models, from large distributed training and validation to efficient inference at scale Design data pipelines in partnership...
Senior, Training, Relocation
$184.5k
...Senior ML/Gen AI Engineer Strategic Partnerships & Affiliates is part of the Expedia... .... We work with a geo‑distributed, cross‑functional team of 5... ...scientists to productize and scale ML models, from experimentation... ...learning systems for training, deployment, inference, and...
Senior, Training
$184.5k
...open world. Join us. Senior ML/Gen AI Engineer Introduction to the... .... You will work with a geo-distributed, cross functional team of 5... ...Scientists to productize and scale ML models, from experimentation... ...learning systems for training, deployment, inference, and...
Senior, Training, Local area, Flexible hours
$150k
...cutting‑edge foundation model training, alongside world‑class... ...researchers, data scientists, and engineers, tackling the most... ...pioneers. The Role The Distributed ML Engineer will play a role at... ...debug methodologies, and large‑scale machine learning experience....
Training, Work experience placement, Visa sponsorship


