Research Engineer, Multimodal Data

Eventual

About Eventual

Every breakthrough Physical AI system (humanoid robots, autonomous vehicles, video generation models) is trained on petabytes of video, lidar, radar, and sensor data. But today's data platforms (Databricks, Snowflake) were built for spreadsheet-like analytics, not the multimodal corpora that power AI.

As a result, robotics and video-AI teams iterate on model improvement about once a week. Most of that week isn't training; it's finding the right data: writing CV heuristics over raw footage, paying annotators for edge cases, hand-curating clips before a cluster ever spins up. GPU bandwidth has grown 2-3× per generation. Storage and pipelines haven't. The gap widens every year.

Eventual was founded in 2022 to close it. Our open-source engine, Daft, is the distributed data engine purpose-built for multimodal AI. It is already running 2 PB/day at Amazon and 60-100 PB at another FAANG company, and is in production at Mobileye, TogetherAI, and CloudKitchens.

For Physical AI, we are building a video-native index on top of our engine that collapses the data iteration loop. Describe the dataset you want, get a curated table in minutes, feed it to your GPUs at line rate. One iteration per day becomes the norm. We're building this in partnership with the top Physical AI labs and public AI infrastructure companies today.

We have raised $30M from Felicis, CRV, Microsoft M12, Citi, Essence, Y Combinator, Caffeinated Capital, Array.vc, and angel investment from the co-founders of Databricks and Perplexity. We've assembled a world-class team from AWS, Render, Pinecone, and Tesla. We have spent our careers powering the last generation of Physical AI in self-driving, and are excited to now do this for the next.

Join our small (but powerful!) team working together 4 days/week in our SF Mission district office.

Your Role

As a Research Engineer on the Visual Understanding team, you'll own the layer that makes petabytes of video queryable by content.

Physical AI teams have raw footage, lidar, radar, and sim outputs scattered across object stores, with no way to find what they need without weeks of human annotation. We change those economics: we run vision-language models over every clip in a corpus along the axes the customer cares about (gripper type, failure mode, object class, scene, motion density), so a researcher can ask for "left-arm grasp failures on deformable objects" and get a curated dataset in minutes.

You'll define the roadmap for our visual understanding capabilities, train and select the models that make corpus-scale annotation tractable at single-digit cents per hour of video, and build the rich datasets that go on to train customer models.

This is a research engineering role: you'll read papers and run experiments, but you ship to production, and your work is judged by what it does for customer training runs.

Key Responsibilities

Own the visual understanding roadmap end-to-end: from picking the model family for a customer's taxonomy to landing it in production inference at corpus scale.
Train, fine-tune, and evaluate VLMs, VQA models, embedding models, and convolutional perception models against customer datasets and benchmarks.
Drive down per-clip annotation cost (model selection, distillation, batching, decode pipelining) so "annotate every clip in a 10K-hour corpus" stays economical.
Build the rich, queryable datasets that customers train on: design taxonomies with researchers, instrument quality, version the outputs.
Partner with the dataloading and storage teams so visual understanding outputs flow into the index and on to the GPU without re-engineering.
Work directly with researchers at our partner labs: your shortest feedback loop is their next training iteration.

What we look for

Strong familiarity with modern vision and multimodal models (convolutional nets, VLMs, VQA, embeddings) and a sense for the SOTA that's actually deployable today vs. on a leaderboard.
Experience running these models at scale on real video and sensor data, ideally for perception tasks (detection, tracking, segmentation, retrieval, captioning).
Background from a perception team at a self-driving, robotics, or visual-data company, or equivalent depth from a research lab.
Comfortable with cloud infrastructure and large-scale data processing: you don't need to be a distributed-systems engineer, but you've shipped jobs that ran on thousands of GPU-hours of video.
Bias toward data and infrastructure: you reach for "annotate the whole corpus" before "fine-tune another model."

Nice to have

Experience training vision or multimodal models from scratch (not just calling APIs).
ML/AI research background: papers, citations, or a research org on your resume.
Hands-on time with big-data frameworks like Spark, Ray, or Daft.
Experience with embeddings, retrieval, or content-aware search at scale.
Experience designing labeling taxonomies or running annotation programs.

Perks & Benefits

In-person, tight-knit team: 4 days/week in our SF Mission office.
Competitive comp and meaningful startup equity.
Catered lunches and dinners for SF employees.
Commuter benefit.
Team-building events and poker nights.
Health, vision, and dental coverage.
Flexible PTO.
Latest Apple equipment.
401(k) plan with match.

If you're excited about being on the team that turns petabytes of raw video into the training data for the next generation of Physical AI, we'd love to talk.

Vacancy posted 3 days ago

