Data Engineer - Multimodal Systems
Zyphra
Zyphra is an artificial intelligence company based in San Francisco, California.

The Role:
As a Data Engineer - Multimodal Systems, you will be a core contributor to creating, collecting, and improving Zyphra’s datasets and data pipelines across a variety of modalities. Your work will intersect with almost every team at Zyphra. You will be involved in collecting large-scale datasets and in implementing and optimizing highly parallel data pipelines.

You’ll Work Across:
- Large-scale data collection across a variety of modalities (text, audio, image)
- Designing and working with highly efficient, parallelized data processing pipelines across modalities
- Designing and running rigorous experimental ablations to demonstrate the impact of new data improvements

What We're Looking For / Requirements:
- Strong implementation and prototyping ability
- Ability to take an idea from conception to experimentation quickly
- Ability to work well with others in a fast-paced research setting
- Ability to rapidly learn new fields, and excitement about implementing new ideas
- Excellent communication and collaboration skills, and the ability to work effectively on both research and engineering implementation at scale

Qualifications / Additional Skills:
- Experience collecting, handling, and processing large datasets
- Experience with parallel Python programming frameworks such as Dask
- Understanding of the state of the art in dataset curation across modalities
- A generally meticulous nature and a strong interest in actually looking at data and sanity checking things
- Strong grasp of proper experimental methodology for running rigorous ablations and other hypothesis testing
- Understanding of and interest in large-scale, highly parallel data processing pipelines
- Proficiency with PyTorch and Python
- Experience contributing to large pre-existing codebases and rapidly getting up to speed
- Previously published machine learning research in well-respected venues
- Postgraduate degree in a scientific subject (Computer Science, EE/EECS, Mathematics, Physics, Machine Learning)

Why Work at Zyphra:
- Our research methodology is grounded in methodical, step-by-step approaches to ambitious goals
- Deep research and engineering excellence are equally valued
- We strongly value new and crazy ideas and are very willing to bet big on them
- We move as quickly as we can; we aim to keep the bar to impact as low as possible
- We all enjoy what we do and love discussing AI

Benefits and Perks:
- Comprehensive medical, dental, vision, and FSA plans
- Competitive compensation and 401(k) plan
- Relocation and immigration support on a case-by-case basis
- In-office snacks and meals provided
- Unlimited PTO and company holidays
- In-person team in San Francisco with a collaborative, high-energy environment

