Data Engineering Lead
Hark
San Jose
About Hark
Hark is an artificial intelligence company building advanced, personalized intelligence: one that is proactive, multimodal, and capable of interacting with the world through speech, text, vision, and persistent memory.
We're pairing that intelligence with next-generation hardware to create a universal interface between humans and machines. While today's AI largely operates through chat boxes and decade-old devices, Hark is focused on what comes next: agentic systems that interact naturally with people and the real world.
To get there, we're developing multimodal models and next-generation AI hardware together, designed from the ground up as a single, unified interface for a new era of intelligent systems.
About the Role
You'll build the data infrastructure that turns raw signals into the training data Hark's models learn from, and the pipelines that keep it flowing at scale.
That means owning the full data engineering stack: ingestion, transformation, quality filtering, and delivery to training and evaluation systems. The models we ship are only as good as the data behind them, and this role owns that foundation.
This is a high-ownership role on a small team. You'll work directly with model researchers, data collection leads, and infrastructure engineers, and the systems you build will directly shape the quality and pace of model development.
Responsibilities
- Design and build scalable data pipelines that ingest, process, and deliver training data across multiple modalities: text, audio, vision, and structured feedback signals.
- Own the data infrastructure stack end-to-end: ingestion, transformation, deduplication, quality filtering, versioning, and delivery to model training and evaluation systems.
- Collaborate closely with model researchers and data collection leads to understand data requirements and translate them into reliable, auditable pipelines.
- Build tooling and frameworks that make it easy for the team to inspect, evaluate, and iterate on data quality. The insights surfaced should feed back into collection and curation decisions.
- Define and enforce data quality standards. Instrument pipelines for correctness, freshness, and coverage. Catch regressions before they reach training.
- Design data systems for reproducibility and scale. The pipelines you build need to handle growing volumes across modalities without becoming a bottleneck.
- Identify gaps in the current stack and drive concrete improvements to throughput, quality, and reliability.
Requirements
- Strong data engineering fundamentals. You are comfortable designing and operating large-scale batch and streaming pipelines, and you care about correctness and reliability.
- Experience building data systems for machine learning. You understand the difference between a data pipeline for analytics and one that feeds model training, and you know what it takes to get the latter right.
- Fluency with the modern data stack. You've worked with tools like Spark, Beam, or Flink, and you know how to make tradeoffs between them. Experience with data versioning systems (e.g., DVC, Delta Lake, Iceberg) is a strong plus.
- Systems thinking. You reason about schema evolution, backfills, and failure modes before they become production incidents. You build for the day-2 case, not just the demo.
- A quality instinct. You don't just move data. You understand what's in it, catch problems early, and close the feedback loop with the people who need clean data.
- Strong communication. You can work closely with model researchers and engineers, explain data tradeoffs clearly, and make good decisions across team boundaries.
- 5+ years of relevant data engineering experience. Experience at a fast-growing AI or research-driven company is a strong plus.
Bonus Qualifications
- Experience building data infrastructure for large language model or multimodal model training.
- Familiarity with multimodal data formats and processing pipelines (audio, video, image).
- Experience with human feedback or preference data pipelines (RLHF, DPO, or similar).
- Hands-on experience with data quality evaluation frameworks or annotation tooling.
- Background in distributed systems, stream processing, or large-scale ETL.
- Experience at a fast-moving AI lab or research-driven company.
Compensation
The US base salary range for this full-time position is $170,000 to $450,000 annually.
The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.


