Data Engineer
$241k - $338k
Biohub
New York, NY (Hybrid); Redwood City, CA (Hybrid)
Biohub is the first large-scale initiative bringing frontier AI models, massive compute, and frontier experimental capabilities under one roof. We're building a general-purpose system to accelerate scientific discovery, integrating frontier AI models, biological foundation models, and lab capabilities, with the ultimate goal of curing disease. Our technology powers scientists around the world, translating AI capabilities into tools that accelerate research everywhere.
The Opportunity
The role is part of the Data Engineering team, which owns the strategy, sourcing, and implementation of data supporting AI research and development. Our goal is to maximize the speed, agility, and capability of biological AI research by connecting public data resources and Biohub's experimental platforms to AI systems. The data that trains biological frontier models comes in dozens of modalities (sequences, images, spatial coordinates, time series, molecular structures, metadata, publication artifacts), each with its own noise characteristics, biases, and information content. The question of how to represent this data for learning is one of the most important open problems in biological AI.
As a data engineer at Biohub, you'll be designing systems that ingest data from public repositories, transform heterogeneous biological formats into AI-ready datasets, combine that with proprietary datasets, and deliver training datasets to researchers pushing the boundaries of what's possible in biological AI. The infrastructure you build will directly shape what our models can learn.
We're a small team with significant resources and long time horizons. We use AI tools aggressively in our own work: Claude Code, agents for workflow automation, and LLMs for metadata extraction. We care about code quality, operational reliability, and building systems that scale. And we care about the biology: we want engineers who can recognize when a pipeline output is technically correct but scientifically wrong.
If you want to work at the intersection of large-scale infrastructure and frontier science, with real autonomy and the chance to build something genuinely new, we'd like to talk.
What You'll Do
- Design and build data pipelines that process genomic and imaging data at petabyte scale
- Solve performance and bandwidth challenges with creative engineering
- Build agent-based systems for automated dataset curation, quality control, and workflow generation
- Create tooling for data cataloging and registration that makes datasets discoverable and accessible
- Collaborate with AI Research teams to translate model requirements into data specifications, and with our scientists to integrate public and internal data into large-scale AI-ready datasets
- Improve pipeline reliability and observability, working toward 99%+ success rates without manual intervention
What You'll Bring
- 8+ years of experience building reliable, operable data systems at scale (hundreds of terabytes to petabytes)
- Strong software engineering fundamentals
- Experience deploying distributed computing frameworks like Databricks, Spark, or Ray for large-scale data processing
- Experience with cloud infrastructure (AWS preferred) and HPC environments
- Comfort with ambiguity; ability to make progress when requirements are evolving
- Interest in AI-native development practices and tooling
- Nice to have: Background in computational biology, bioinformatics, or life sciences and experience with genomics datasets and formats (FASTQ, BAM, VCF) or imaging formats (OME-Zarr, HDF5)
Compensation
The future anticipated base pay range for a role in this field is $241,000–$338,000 annually. Compensation ranges will vary based on job-related skills, level of experience, and knowledge. Actual placement in range is based on job-related skills and experience, as evaluated throughout the interview process.
Better Together
As we grow, we're excited to strengthen in-person connections and cultivate a collaborative, team-oriented environment. This role is a hybrid position requiring you to be onsite for at least 60% of the working month, approximately 3 days a week, with specific in-office days determined by the team's manager. The exact schedule will be at the hiring manager's discretion and communicated during the interview process.
Benefits for the Whole You
We're thankful to have an incredible team behind our work. To honor their commitment, we offer a wide range of benefits to support the people who make all we do possible.
- A generous employer match on employee 401(k) contributions to support planning for the future.
- Paid time off to volunteer at an organization of your choice.
- Funding for select family-forming benefits.
- Relocation support for employees who need assistance moving.

