Senior Research Engineer, Training Data Infrastructure in Foundation Models
Apple Inc.
Cupertino, California, United States - Software and Services

Our team is dedicated to solving the high-quality training data problem at the scale required to train advanced Foundation Models. We believe that advanced model performance (including reasoning, coding, and agentic planning) fundamentally depends on a data-centric approach to Machine Learning. Our objective is to engineer a large-scale system that acquires, processes, and curates the data required to advance the state of the art in Artificial Intelligence.

We are seeking a Senior Research Engineer who possesses a deep understanding of distributed systems and a strong intuition for Machine Learning. You will join a culture that values engineering craftsmanship, privacy, and rigorous scientific inquiry, utilizing advanced cloud technologies to build the data systems that power our most capable models.

Description

This position operates at the convergence of Software Engineering and Machine Learning Research. Unlike traditional backend roles, this position requires you to design systems where the outcome is the statistical distribution and quality of the data itself. You will work alongside Research Scientists to transform theoretical observations into concrete, scalable engineering solutions. Your core focus will be the architecture of our Data Acquisition, Processing, and Repository Management systems for Large Model training. You will lead technical efforts to enable active, quality-driven data curation, including filtering, deduplication, synthetic data generation, and data mixing, ensuring our models are trained on the highest-quality information available.

Responsibilities

- Architect Scalable Ingestion Systems: Design and implement high-throughput distributed systems to ingest petabytes of text and multimodal data from diverse sources, including web crawls and third-party partnerships.
- Repository Optimization: Manage the lifecycle of large-scale datasets across data storage and high-performance file systems. Optimize data formats for efficient random access and sequential scanning during model training.
- Data Governance & Privacy: Engineer robust data governance and privacy solutions for the training data, in collaboration with compliance and legal teams, to ensure adherence to stringent regulatory standards.
- High-Performance Processing Pipelines: Build and maintain distributed data processing workflows using frameworks on cloud infrastructure (e.g., GCP, AWS).
- Algorithmic Data Curation: Implement sophisticated data filtering and selection logic to remove low-quality content, and develop semantic deduplication at scale to prevent model memorization and improve training efficiency.
- Decontamination: Design automated systems to detect and remove benchmark leakage, ensuring that evaluation datasets remain strictly isolated from training corpora.
- Infrastructure for Scaling Laws: Collaborate with researchers to enable data ablations and scaling experiments. Build tools to support systematic data mixture optimization and empirical data studies.

…
$224k - $356.5k
NVIDIA is searching for a senior or principal engineer who specializes in building cutting‑edge infrastructure for large‑scale foundation model training in the Generalist Embodied Agent Research (GEAR) group. Our team... ...datasets. Implement scalable data loaders and...SeniorTrainingFoundationFull time- A leading technology company located in Cupertino, California, is seeking a Senior Research Engineer focused on training data infrastructure for advanced AI models. The ideal candidate will possess strong skills in distributed systems and a deep understanding of Machine...SeniorTrainingFoundation
$224k - $356.5k
NVIDIA Gruppe is seeking a Senior or Principal Engineer for their GEAR group, focusing on large-scale foundation model training for humanoid robots. You'll design distributed training... ...systems and collaborate with a top-tier research team to impact their projects...SeniorTrainingFoundation- ...recruiting top research engineers in the Autonomous... ...and generative modeling. You must have strong... ...track record of training deep learning... ...mathematical foundation to analyze new AI... ...Implement scalable data loaders and... ...optimize simulation infrastructure (based on GPU-...SeniorTrainingFoundationFull time
$181.1k - $318.4k
...something! Description As a Senior/Staff Engineer on the Foundation Model Compute Infrastructure team, you will lead the design... ...efficient execution of large‑scale training and inference jobs. This role... ...skills across engineering and research teams Bachelor’s degree in...SeniorTrainingFoundationRelocation- ...robot systems to the infrastructure and state-of-the-art foundation world models that control our robots... ...possible by our cutting edge research and end-to-end system... ...Scientist or Research Engineer focused on model... ...robot hardware Develop training strategies that...TrainingFoundation
- ...and cutting-edge models, products and... ...next generation of data infrastructure at Mistral AI.... ...access for MLOps and research. You will take... ...for critical training jobs. What will... ...growth. Platform Engineering: Contribute to... ...interest in supporting foundational compute and...TrainingWork at officeVisa sponsorship
$180k - $258.75k
...Description At Toyota Research Institute (TRI)... ...developing the engineering infrastructure needed to train, evaluate, and... ...looking for a Senior Research... ...software engineering foundation, deep... ...geometry or physical modeling, and a genuine... ...including efficient data structures,...SeniorTrainingFoundationLocal areaShift work$245k - $295k
...Senior Manager, Infrastructure Platform Engineering Crusoe is on a mission to accelerate... ..., manufacturing, data center... ...capacity. The team owns foundational services spanning... ...and control systems modeling resource and system... ...GPU clusters, AI training, and inference workloads...SeniorTrainingFoundationTemporary workImmediate start- ...technology company is seeking a Research Engineer to enhance its core research... .... The role involves improving models for web-scale indexing and establishing training strategies. Candidates should... ...systems and have a strong academic foundation. This fully in-person position...TrainingFoundation
$192k - $304.75k
...are now looking for a Senior Research Scientist focused on Multimodal Foundation Models and Robotics! NVIDIA... ...Develop large‑scale AI training and inference methods... ...with research and engineering teams across all of NVIDIA... ...systems and compute infrastructure. Robotics: Hands‑on...SeniorTrainingFoundation- ...the operator intelligence layer in AI infrastructure. You will design transformer frameworks... ...-scale datasets, manage distributed training, and ensure robust production systems.... ...candidates will have deep expertise in foundation model architecture and experience with production...TrainingFoundation
$233k - $341k
...combines superior infrastructure performance with... ...visionary VP of Research Training Infrastructure .... ...strategy and engineering execution for the... ...where frontier models are born. The Role... ...labs to refine foundation models with... ...Factories , not just data centers....SeniorTrainingFoundationPermanent employmentTemporary workCasual workWork at officeRemote workFlexible hours- Senior Applied Scientist, Delivery Foundation Model job at Amazon.com Services LLC... ...science and engineering revolution at... ...Amazon's vast data and computational... ...problems to train foundation models... ...for specific research initiatives,... ...and evaluation infrastructure. Guide and...SeniorTrainingFoundationWorldwide
$165k - $238k
...and riskiness of research with the speed... ...an innovation engine, X focuses on repeatedly... ...ideas into the foundations for large,... ...The Role As a Senior Applied... ...of unstructured data. We look for engineers... ...Experience training, fine-tuning, or distilling models for specialized...SeniorTrainingFoundationFull timeWork at office3 days per week$147.4k - $272.1k
...and large language models. Our centralized applied research and engineering group is dedicated... ...encompassing early ideation, data definition, model training, and fine‑tuning.... ...projects, spanning foundational concepts to... ...to define data and infrastructure requirements crucial...TrainingFoundationRelocation$184k - $287.5k
We are now looking for a Senior Robotics Research Engineer (Robotics & AI for Drug Discovery... ..., simulation, world models, and multimodal action models... .../torque, tactile) and foundation models Translate experimental... ...motion planning pipelines Training robots to solve contact-...SeniorTrainingFoundation$165k - $185k
...Responsibilities: Conduct research on GenAI and Foundation models (FM) to address academic... ...model, map and localization, data curation and auto-... .... in Computer Science or Engineering, or a related discipline... ...foundation models, including training, fine-tuning, and prompting...SeniorTrainingFoundation- A leading AI infrastructure company is seeking a Member of Technical Staff to focus on foundation model architecture and AI systems engineering. You will drive architectural ownership, product reliability, and scalable training while deploying AI solutions in industrial...TrainingFoundationFull time
$174k - $253k
Senior Software Engineer, AI/ML Training Infrastructure Google Mountain View, CA, USA Apply... ...infrastructure (e.g., model deployment, model... ..., optimization, data processing,... ...data and training foundation that powers AI innovation... ...the latest research to improve model quality...SeniorTrainingFoundation$120.75k - $251.25k
...write mobile app code, engineer the servers behind... ...trillions of data points a day, what you... ...‑ and AI‑based data infrastructure, supporting new functionalities... ...learning and modeling, as well as satisfying... ...Build the data foundation for ML training pipelines—including...SeniorTrainingFoundationWork at officeFlexible hours$181.1k - $318.4k
Senior Machine Learning Research Engineer, NLP, Input Experience Cupertino, California... ...building groundbreaking NLP models to optimizing them for... ...building blocks and infrastructure that integrate these... ...of representative training and evaluation data. Implementation of experiments...SeniorTrainingRelocation$147.4k - $272.1k
...Sr Machine Learning Engineer - Data and ML Innovation Cupertino... ...for multi-modal models with strong agent and... ...of machine learning researchers, engineers, and data... ...innovative research in foundation models to with a particular... ...ML pipeline—from pre-training on large-scale...SeniorTrainingFoundationWorldwideRelocation$174k - $252k
Senior Software Engineer, Infrastructure, Persistent Disk Google Sunnyvale, CA, USA Qualifications... ...years of experience with data structures and algorithms... ...‑critical block storage foundation for Google Cloud,... ...and relevant education or training. Your recruiter can share...SeniorTrainingFoundationFull time$126k - $423k
...creating the digital infrastructure needed to bring... ...for a passionate Research Engineer (AI/RL... ...millions of miles of data from large fleets... ...Manager capacity; Senior/Staff level experience... ...Design and build training and evaluation... ...systems to measure model performance across...TrainingFull timeFor contractorsFor subcontractorCasual workWork at officeImmediate startRemote workDay shift$150k
A leading research lab in Sunnyvale is seeking a distributed ML infrastructure engineer to extend and scale training systems. The ideal candidate must have over 5 years of experience... ...with comprehensive benefits and amenities. Institute of Foundation ModelsTrainingFoundation$85.5k - $150.77k
...Senior Cost Engineer & Data Analyst | Lockheed Martin At the dawn of a new space age, Lockheed Martin... ...data collection, executing cost models, dashboard building/automation, and communicating... ...'s work experience, education/ training, key skills as well as market and...SeniorTrainingFull timeTemporary workWork experience placementFlexible hours$272k - $431.25k
We are seeking a Senior Research Manager to lead world‑model evaluation and benchmarking across... ...findings into better data, training recipes, model roadmaps... ..., including world foundation models, world‑action models... ...Science, Electrical Engineering, Robotics, Machine Learning...SeniorTrainingFoundation$204k - $343k
About The Role As an Engineering Manager on the Data Intelligence team,... ...the adoption of foundation models and cutting‑edge... ...scale data mining infrastructure Lead the integration... ...in alignment with training, evaluation, and... ...requirements Partner with research, autonomy, and...TrainingFoundationFull timeFor contractorsFor subcontractor$166k - $225k
...passionate about enabling data teams to solve the... ...world's best data and AI infrastructure platform so our... ...business. Founded by engineers — and customer obsessed... ...Scala or C++. Strong foundation in algorithms and data... ...relevant certifications and training, and specific work...SeniorTrainingFoundationLocal areaWorldwide
