Research Scientist, Data
Pika
About the Role

At Pika, we are pioneering the next generation of creative infrastructure built around real-time, multimodal generation and intelligent agentic platforms. We are looking for a staff or lead-level Research Engineer, Data to architect and scale data engineering systems supporting model training for our advanced multimodal foundation models. This pivotal role will strengthen our research teams by building, optimizing, and owning large-scale data pipelines and robust ML data curation, ensuring our foundation models have access to the highest quality and most diverse datasets. If you are passionate about powerful data infrastructure and innovative research-engineering, join us to make an impact for millions of creators.

What You’ll Do

- Take ownership of large-scale data pipeline architecture and implementation to support model training and research workflows for text, image, audio, and video datasets
- Partner with research and engineering teams to curate, clean, and manage diverse, sensory-rich datasets for pre-training and mid-training of multimodal models
- Develop strategies and tools for scalable data ingestion, labeling, filtering, augmentation, and storage
- Ensure data quality, reliability, and compliance, including managing privacy and ethical considerations throughout the data lifecycle
- Optimize data processing, transformation, and delivery for large-scale distributed training pipelines
- Prototype and productionize new methods for dataset creation, management, and continuous improvement in response to researcher needs
- Contribute to the integration of research-driven data advancements into production-ready systems
- Stay informed on emerging data engineering and ML data management developments, bringing best practices to our systems

What We’re Looking For

- 5+ years of experience building and scaling data pipelines for machine learning applications at staff or lead engineer level, ideally in research or model training environments
- Strong background in data engineering and ML data curation for LLMs, VLMs, or other large-scale multimodal models
- Expertise in distributed data systems (e.g., Spark, Hadoop, Ray, or similar) and efficient large dataset processing/ETL workflows
- Proven ability to build robust, scalable, and production-grade data infrastructure for ML pipelines
- Experience developing tools for data labeling, filtering, deduplication, quality assurance, and dataset management
- Strong programming skills (Python, SQL, PySpark, or similar) and familiarity with cloud data platforms (AWS, GCP, Azure)
- Knowledge of privacy, compliance, ethics, and best practices in data collection and management
- Excellent cross-functional collaboration, problem-solving, and communication skills
- Passion for enabling cutting-edge generative AI and creative technology through data excellence

What We Offer

- Competitive salary and substantial equity in a high-growth startup
- Full health benefits, 401k matching, and more
- Collaborative, mission-driven team environment with major growth opportunities
- Flexible on-site/remote hybrid (HQ in Palo Alto, CA)

About Pika

Pika empowers creators by building state-of-the-art agentic and multimedia platforms. Our vision is to break down technical barriers to creativity, making real-time generative and intelligent orchestration accessible to all. Join us and help shape the next evolution of creative technology!

If you are a data-driven research engineer excited to lead and scale the data infrastructure powering real-time multimodal foundation models, we want to hear from you.



