
Staff ML Data Engineer (Datagrid)

$227.33k - $312.58k

Procore Technologies


We're looking for a Staff ML Data Engineer to join Procore's AI & Frontier Models organization. In this role, you'll be responsible for designing and building the data systems that power frontier-scale machine learning research and applied AI products, with a particular focus on spatial intelligence and multimodal data. The primary goal of this role is to ensure that researchers and engineers can reliably discover, curate, transform, and operate on large-scale datasets that move from experimentation to production.

As a Staff ML Data Engineer, you'll work closely with ML researchers, applied ML engineers, and system architects to turn ambiguous research needs into scalable, production-ready data pipelines. You'll remain deeply hands-on while providing technical leadership in data architecture, quality, and operational excellence. This is an opportunity to shape how Procore builds, evaluates, and deploys frontier models by ensuring the underlying data systems are robust, observable, and designed for iteration.

This role reports to the Manager, Software Engineering, and is based in our San Francisco office, supporting Procore's Datagrid AI Division. Given the collaborative and fast-moving nature of this work, we are seeking candidates who can work onsite in a hybrid model at least 3 days per week. This is an immediate opening!

What You'll Do
  • Act as the technical lead for data engineering efforts supporting frontier model research and applied ML systems.
  • Design, build, and maintain scalable batch and streaming pipelines for multimodal data (e.g., documents, images, spatial metadata).
  • Partner closely with researchers and architects to translate experimental workflows into reliable, repeatable data systems.
  • Lead the development of dataset curation, versioning, and lineage workflows that support rapid experimentation and reproducibility.
  • Establish and uphold standards for data quality, validation, observability, and cost efficiency across AI data pipelines.
  • Contribute to data architecture decisions spanning research environments and production systems.
  • Identify gaps or inefficiencies in existing data workflows and run proofs-of-concept to evaluate improvements.
  • Mentor other engineers through code reviews, design discussions, and hands-on collaboration.

What We're Looking For
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
  • 8+ years of experience designing and operating complex data systems in production or research-adjacent environments.
  • Strong proficiency in SQL and Python; experience with data-intensive or distributed systems.
  • Proven experience building scalable data pipelines that support machine learning training, evaluation, or inference workflows.
  • Solid understanding of data modeling, dataset lifecycle management, and data quality best practices.
  • Comfort operating in highly ambiguous problem spaces and collaborating closely with researchers and architects.
  • Demonstrated ability to lead through direct technical contribution, mentorship, and setting engineering standards.
  • Strong communication skills, with the ability to explain technical tradeoffs to both research and engineering audiences.

Nice-to-have experience with technologies such as:

  • ML & Research Data: Large-scale dataset curation, annotation workflows, experiment tracking, reproducibility tooling
  • Data Platforms: Databricks, Spark, lakehouse architectures, cloud data warehouses
  • Streaming & Pipelines: Kafka, Pub/Sub, event-driven data architectures
  • Orchestration & Observability: Airflow, Dagster, data quality and lineage tools
  • Cloud & Infrastructure: AWS or GCP, containerized data workloads, CI/CD, infrastructure-as-code
  • Performance & Cost: Optimizing data pipelines for GPU-backed training and large-scale inference workloads

Additional Information

Base Pay Range: 227,332.00 - 312,581.50 USD Annual

This role may also be eligible for Equity Compensation and/or Bonus Incentive Compensation. Procore is committed to offering competitive, fair, and commensurate compensation. Actual compensation will be based on a candidate's job-related skills, experience, education or training, and location.

For Los Angeles County (Unincorporated) Candidates:

Procore will consider for employment all qualified applicants, including those with arrest or conviction records, in accordance with the requirements of applicable federal, state, and local laws, including the City of Los Angeles' Fair Chance Initiative for Hiring Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act.

A criminal history may have a direct, adverse, and negative relationship on the following job duties, potentially resulting in the withdrawal of the conditional offer of employment: 1. appropriately managing, accessing, and handling confidential information including proprietary and trade secret information, as well as accessing Procore's information technology systems and platforms; 2. interacting with and occasionally having unsupervised contact with internal/external customers, stakeholders, and/or colleagues; and 3. exercising sound judgment.

Vacancy posted 12 hours ago