Data Engineer - Multimodal Systems
Zyphra
Zyphra is an artificial intelligence company based in San Francisco, California.

The Role:
As a Data Engineer - Multimodal Systems, you will be a core contributor to creating, collecting, and improving Zyphra’s datasets and data pipelines across a variety of modalities. Your work will intersect with almost every team at Zyphra. You will be involved in collecting large-scale datasets and implementing and optimizing highly parallel data pipelines.

You’ll Work Across:
- Large-scale data collection across a variety of modalities (text, audio, image)
- Designing and working with highly efficient, parallelized data processing pipelines across modalities
- Designing and running rigorous experimental ablations to demonstrate the impact of new data improvements

What We're Looking For / Requirements:
- Strong implementation and prototyping ability
- Can take an idea from conception to experimentation quickly
- The ability to work well with others in a fast-paced research setting
- Can rapidly learn new fields and is excited to implement new ideas
- Excellent communication and collaboration skills, and can work effectively on both research and engineering implementation at scale

Qualifications / Additional Skills:
- Experience collecting, handling, and processing large datasets
- Experience with parallel Python programming frameworks such as Dask
- Understanding of the state of the art in dataset curation across modalities
- A generally meticulous nature and a strong interest in actually looking at data and sanity-checking things
- Strong grasp of proper experimental methodology for running rigorous ablations and other hypothesis testing
- Understanding of and interest in large-scale, highly parallel data processing pipelines
- Proficiency with PyTorch and Python
- Experience contributing to large pre-existing codebases and rapidly getting up to speed
- Previously published machine learning research in well-respected venues
- Postgraduate degree in a scientific subject (Computer Science, EE/EECS, Mathematics, Physics, Machine Learning)

Why Work at Zyphra:
- Our research methodology is grounded in methodical, step-by-step approaches to ambitious goals
- Deep research and engineering excellence are equally valued
- We strongly value new and crazy ideas and are very willing to bet big on them
- We move as quickly as we can; we aim to keep the bar to impact as low as possible
- We all enjoy what we do and love discussing AI

Benefits and Perks:
- Comprehensive medical, dental, vision, and FSA plans
- Competitive compensation and 401(k) plan
- Relocation and immigration support on a case-by-case basis
- In-office snacks and meals provided
- Unlimited PTO and company holidays
- In-person team in San Francisco with a collaborative, high-energy environment

