Senior Research Engineer, Training Data Infrastructure in Foundation Models

Apple Inc.

Cupertino, California, United States - Software and Services

Our team is dedicated to solving the problem of high-quality training data at the scale required to train advanced Foundation Models. We believe that advanced model performance (including reasoning, coding, and agentic planning) fundamentally depends on a data-centric approach to Machine Learning. Our objective is to engineer a large-scale system that acquires, processes, and curates the data required to advance the state of the art in Artificial Intelligence.

We are seeking a Senior Research Engineer who possesses a deep understanding of distributed systems and a strong intuition for Machine Learning. You will join a culture that values engineering craftsmanship, privacy, and rigorous scientific inquiry, using advanced cloud technologies to build the data systems that power our most capable models.

Description

This position operates at the convergence of Software Engineering and Machine Learning Research. Unlike traditional backend roles, it requires you to design systems whose outcome is the statistical distribution and quality of the data itself. You will work alongside Research Scientists to transform theoretical observations into concrete, scalable engineering solutions. Your core focus will be the architecture of our Data Acquisition, Processing, and Repository Management systems for Large Model training. You will lead technical efforts to enable active, quality-driven data curation, including filtering, deduplication, synthetic data generation, and data mixing, ensuring our models are trained on the highest-quality information available.

Responsibilities

  • Architect Scalable Ingestion Systems: Design and implement high-throughput distributed systems to ingest petabytes of text and multimodal data from diverse sources, including web crawls and third-party partnerships.
  • Repository Optimization: Manage the lifecycle of large-scale datasets across data storage and high-performance file systems. Optimize data formats for efficient random access and sequential scanning during model training.
  • Data Governance & Privacy: Engineer robust data governance and privacy solutions for the training data, in collaboration with compliance and legal teams, to ensure adherence to stringent regulatory standards.
  • High-Performance Processing Pipelines: Build and maintain distributed data processing workflows using frameworks on cloud infrastructure (e.g., GCP, AWS).
  • Algorithmic Data Curation: Implement sophisticated data filtering and selection logic to remove low-quality content, and develop semantic deduplication at scale to prevent model memorization and improve training efficiency.
  • Decontamination: Design automated systems to detect and remove benchmark leakage, ensuring that evaluation datasets remain strictly isolated from training corpora.
  • Infrastructure for Scaling Laws: Collaborate with researchers to enable data ablations and scaling experiments. Build tools to support systematic data mixture optimization and empirical data studies.

…

Vacancy posted 2 days ago