Staff Data Engineer - Data Lake
$170k - $190k
H1
At H1, we believe access to the best healthcare information is a basic human right. Our mission is to provide a platform that can optimally inform every doctor interaction globally. This promotes health equity and builds needed trust in healthcare systems. To accomplish this, our teams harness the power of data and AI technology to unlock groundbreaking medical insights and convert those insights into actions that result in optimal patient outcomes and accelerate an equitable and inclusive drug development lifecycle. Visit h1.co to learn more about us.
Data Engineering is responsible for the development and delivery of our most important asset: our data. With thousands of data sources from around the world, the team ensures that data is accurate, normalized, and delivered at a velocity that keeps up with real-world changes. As we expand our markets and the scope of data we provide to our customers, our team must scale to meet that demand.

WHAT YOU'LL DO AT H1

As a Staff Data Engineer on the Data Lake team at H1, you will play a critical role in shaping the architecture, scalability, reliability, and long-term direction of our core data platform. This role is designed for a highly technical engineer who is excited to grow into an Engineering Manager track while remaining deeply hands-on technically. The Data Lake is the foundation of H1's platform, responsible for the validation, accuracy, standardization, and quality of the data powering every downstream product and team across the organization. You will help lead the evolution of this platform while supporting and mentoring a growing team of engineers.

You will:
- Architect, build, and scale distributed ETL/ELT pipelines and large-scale ingestion frameworks across structured and unstructured healthcare datasets.
- Lead the evolution of H1's Data Lake architecture with a focus on scalability, observability, reliability, and cost optimization.
- Own and improve data quality, validation, normalization, and standardization workflows across thousands of global data sources.
- Design and optimize batch and near real-time data processing frameworks using cloud-native distributed systems.
- Optimize distributed compute and storage systems, including Spark workloads, query performance, partitioning strategies, and infrastructure efficiency.
- Drive improvements in monitoring, governance, operational excellence, and production reliability across the platform.
- Troubleshoot complex production data and infrastructure issues across distributed systems.
- Partner closely with Product, Infrastructure, Security, Compliance, and downstream engineering teams to support scalable and secure data delivery.
- Mentor engineers through technical leadership, architecture reviews, and engineering best practices.
- Help define technical roadmap priorities and contribute to long-term platform strategy and execution planning.
- Support production operations, incident response, and platform health as part of overall ownership of the Data Lake ecosystem.

ABOUT YOU

You are a highly technical data engineer who thrives in lean, high-ownership environments and enjoys solving complex distributed systems challenges. You are excited by the opportunity to influence technical direction, mentor engineers, and grow into broader engineering leadership responsibilities while remaining hands-on.

- You have deep experience designing and scaling distributed data platforms and large-scale pipelines in cloud-native environments.
- You excel at building reliable, observable, and maintainable data systems supporting critical business and analytics workloads.
- You have strong expertise in distributed processing, performance optimization, and modern data architecture patterns.
- You are comfortable leading technical initiatives and influencing architecture decisions across teams.
- You communicate effectively with both technical and non-technical stakeholders.
- You enjoy mentoring engineers and helping raise the engineering bar across teams.
- You are energized by ownership, autonomy, and solving ambiguous technical challenges.

REQUIREMENTS

- 8+ years of experience in data engineering, software engineering, or related fields with significant experience building and scaling distributed data platforms.
- Demonstrated technical leadership experience with interest in or experience mentoring and leading engineers.
- Strong proficiency in Python (PySpark), Java, Scala, or similar programming languages.
- Advanced SQL expertise, including performance tuning and optimization across large datasets.
- Deep experience with Apache Spark and cloud-native big data platforms, preferably within AWS environments (EMR, Glue, S3, Athena, Redshift, or similar).
- Experience designing and scaling modern cloud-native data lake architectures and large-scale ingestion frameworks.
- Experience with orchestration and workflow management tools such as Argo, Airflow, or similar technologies.
- Strong understanding of distributed storage systems, partitioning strategies, and file formats such as Parquet, Avro, and ORC.
- Experience with Docker, Kubernetes, and modern containerization technologies.
- Experience implementing monitoring, observability, and data quality frameworks within production environments.
- Experience with large-scale data cleaning, parsing, normalization, and validation workflows preferred.
- Experience working with healthcare, life sciences, publication, or large-scale entity-resolution datasets preferred.
- Exposure to ML/AI-driven data enrichment, parsing, or validation workflows is a plus.
- Experience using AI-assisted coding tools (e.g., GitHub Copilot, Claude Code) to accelerate development while maintaining quality is encouraged.

COMPENSATION

This role pays $170,000 to $190,000 per year, based on experience, in addition to stock options.

Anticipated role close date: 8/1/2026

H1 OFFERS

- Full suite of health insurance options, in addition to generous paid time off
- Pre-planned company-wide wellness holidays
- Retirement options
- Health & charitable donation stipends
- Impactful Business Resource Groups
- Flexible work hours & the opportunity to work from anywhere
- The opportunity to work with leading biotech and life sciences companies in an innovative industry with a mission to improve healthcare around the globe

H1 is proud to be an equal opportunity employer that celebrates diversity and is committed to creating an inclusive workplace with equal opportunity for all applicants and teammates. Our goal is to recruit the most talented people from a diverse candidate pool regardless of race, color, ancestry, national origin, religion, disability, sex (including pregnancy), age, gender, gender identity, sexual orientation, marital status, veteran status, or any other characteristic protected by law.

H1 is committed to working with and providing access and reasonable accommodation to applicants with mental and/or physical disabilities. If you require an accommodation, please reach out to your recruiter once you've begun the interview process. All requests for accommodations are treated discreetly and confidentially, as practical and permitted by law.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans.
If you would like more information about how your data is processed, please contact us.

