Data Engineer - Multimodal Systems

Zyphra

Zyphra is an artificial intelligence company based in San Francisco, California.

The Role:
As a Data Engineer - Multimodal Systems, you will be a core contributor to creating, collecting, and improving Zyphra’s datasets and data pipelines across a variety of modalities. Your work will intersect with almost every team at Zyphra. You will be involved in collecting large-scale datasets and implementing and optimizing highly parallel data pipelines.

You’ll Work Across:
  • Large-scale data collection across a variety of modalities (text, audio, image)
  • Designing and working with highly efficient, parallelized data processing pipelines across modalities
  • Designing and running rigorous experimental ablations to demonstrate the impact of new data improvements

What We're Looking For / Requirements:
  • Strong implementation and prototyping ability
  • Can take an idea from conception to experimentation quickly
  • The ability to work well with others in a fast-paced research setting
  • Can rapidly learn new fields and is excited to implement new ideas
  • Excellent communication and collaboration skills, and can work effectively on both research and engineering implementation at scale

Qualifications / Additional Skills:
  • Experience collecting, handling, and processing large datasets
  • Experience with parallel Python programming frameworks such as Dask
  • Understanding of the state of the art in dataset curation across modalities
  • A generally meticulous nature and a strong interest in actually looking at data and sanity-checking things
  • Strong grasp of proper experimental methodology for running rigorous ablations and other hypothesis testing
  • Understanding of and interest in large-scale, highly parallel data processing pipelines
  • Proficiency with PyTorch and Python
  • Experience contributing to large pre-existing codebases and rapidly getting up to speed
  • Previously published machine learning research in well-respected venues
  • Postgraduate degree in a scientific subject (Computer Science, EE/EECS, Mathematics, Physics, Machine Learning)

Why Work at Zyphra:
  • Our research methodology is grounded in methodical, step-by-step approaches to ambitious goals
  • Both deep research and engineering excellence are equally valued
  • We strongly value new and crazy ideas and are very willing to bet big on new ideas
  • We move as quickly as we can; we aim to keep the bar to impact as low as possible
  • We all enjoy what we do and love discussing AI

Benefits and Perks:
  • Comprehensive medical, dental, vision, and FSA plans
  • Competitive compensation and 401(k) plan
  • Relocation and immigration support on a case-by-case basis
  • In-office snacks and meals provided
  • Unlimited PTO and company holidays
  • In-person team in San Francisco with a collaborative, high-energy environment

Vacancy posted 1 day ago