Sr. / Staff ML Engineer, FM Training Integration - ML Compute

$181.1k - $318.4k

Apple Oakbrook

We are looking for an ML Engineer to join our ML Compute team to help improve the efficiency, scalability, and reliability of model training and inference workloads in the cloud. In this role, you will lead the integration of large-scale ML workloads with cloud infrastructure, working cross-functionally with ML engineers, infrastructure engineers, and researchers to optimize performance, improve system efficiency, and drive high utilization of accelerator resources.

We are a group of engineers who support the training of foundation models at Apple! We build infrastructure for training foundation models with general capabilities, such as understanding and generating text, images, speech, videos, and other modalities, and we apply these models to Apple products. We are looking for engineers who are passionate about building systems that push the frontier of deep learning in scaling, efficiency, and flexibility, and that delight millions of users of Apple products.

Responsibilities
  • Own the integration of large-scale model training workloads with accelerator-based cloud infrastructure, ensuring scalable and reliable execution.
  • Drive performance optimization across the ML stack, including data pipelines, model execution, and distributed systems, to improve throughput, latency, and hardware utilization.
  • Design and run benchmarks to evaluate model performance and infrastructure configurations, using results to guide optimization efforts.
  • Build and improve tooling for observability, profiling, and debugging to increase visibility and reliability of ML workloads.
  • Collaborate cross-functionally with ML engineers, infrastructure engineers, and researchers to improve system efficiency and scalability.
  • Establish and promote best practices for performance tuning and resource utilization.
  • Drive high-quality design and code reviews, share best practices, and elevate engineering standards across the team.
Minimum Qualifications
  • 5+ years of experience in software engineering, ML infrastructure, or related domains.
  • Hands-on experience with machine learning workflows, including training, evaluation, and inference at scale.
  • Proficiency in Python and experience with at least one major ML framework (e.g., PyTorch or JAX).
  • Experience with cloud-based infrastructure and distributed systems (e.g., containers, orchestration, storage, and networking).
  • Bachelor's degree in Computer Science, Engineering, or a related field.
Preferred Qualifications
  • Experience working with accelerator-based systems (e.g., GPUs/TPUs), including performance tuning and debugging of ML workloads.
  • Hands-on experience with distributed training or inference at scale (e.g., data, model, or pipeline parallelism).
  • Experience optimizing large-scale ML systems, including bottleneck analysis across compute, memory, and networking.
  • Familiarity with profiling, tracing, and benchmarking tools for ML workloads (e.g., PyTorch Profiler, NVIDIA Nsight).
  • Experience building or operating ML infrastructure using containerization and orchestration frameworks (e.g., Docker, Kubernetes).
  • Advanced degree in Computer Science, Engineering, or a related field.
Pay & Benefits

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $181,100 and $318,400, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple's discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple's Employee Stock Purchase Plan. You'll also receive benefits including: comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation.

Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics.

At Apple, we believe accessibility is a fundamental human right. You'll find that idea reflected in everything here: in our culture, our benefits and our digital tools. By welcoming as many perspectives as possible, we help you build a career where you feel like you belong.

Apple accepts applications to this posting on an ongoing basis.

Vacancy posted 2 days ago
