Data Engineer: Scalable Pipelines for ML Workflows

New Groyp Talentoj

Roles and Responsibilities
  • Design, build, and maintain scalable and reliable data pipelines for dataset creation, transformation, and benchmarking
  • Own and optimize Airflow pipelines on AWS for data processing, orchestration, and evaluation workflows
  • Write efficient, production-grade SQL and Python code for large-scale data processing and analysis
  • Partner closely with ML engineers to enable model training, evaluation, and benchmarking pipelines
  • Improve pipeline performance, reliability, and observability, ensuring high data quality in production
  • Build and maintain systems to support model performance tracking and data drift monitoring
  • Troubleshoot and resolve data issues across pipelines, ensuring minimal impact on ML workflows
  • Contribute to data architecture decisions and best practices across the platform
  • Collaborate cross-functionally with ML, platform, and data teams to support scalable ML infrastructure

What We're Looking For
  • 3-5 years of experience in Data Engineering, Data Platforms, or related roles
  • Strong proficiency in Python and SQL with experience in production systems
  • Hands-on experience with AWS services (S3, EC2, SageMaker or similar)
  • Solid experience building and managing Airflow (or similar orchestration tools)
  • Strong understanding of data engineering fundamentals (ETL/ELT, data modeling, pipeline design)
  • Experience working with large-scale datasets and distributed data systems
  • Experience supporting ML workflows, datasets, or evaluation pipelines
  • Strong problem-solving skills and ability to work independently in a fast-paced environment

Nice to Have
  • Experience with ML infrastructure, MLOps, or model evaluation workflows
  • Exposure to biometric systems or computer vision datasets
  • Familiarity with data quality frameworks, monitoring, and observability tools
  • Experience working in SaaS or high-scale production environments

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the Data Engineer: Scalable Pipelines for ML Workflows in New York, NY vacancy
  •  ...P Global in New York City is seeking a ML Data Engineer to build and maintain evolving data infrastructure...  ...and engineers to optimize data workflows and ensure high-quality data processing...  ..., with a strong emphasis on delivering scalable solutions within an Agile framework.... 
    Pipeline

    S&P Global

    New York, NY
    3 days ago
  • TryApplyNow is seeking a Data & AI/ML Engineer in New York, NY. The role focuses on designing and maintaining scalable data pipelines and infrastructure for deploying ML models. Ideal candidates will have a Bachelor's or Master's degree and over 4 years of experience in... 
    Pipeline

    TryApplyNow

    New York, NY
    3 days ago
  •  ...Brokers Group, Inc. based in New York is seeking a Data Engineer to design, build, and maintain scalable data infrastructure and analytics solutions. The role...  ...involve collaborating with teams to architect data pipelines that support enterprise-wide AI initiatives. The... 
    Pipeline

    Interactive Brokers

    New York, NY
    2 days ago
  •  ...leading technology research organization is seeking an AI Data Engineer to build scalable data pipelines and collaborate with data scientists on impactful AI...  ...The position offers the opportunity to contribute to innovative projects in AI and ML.
    Pipeline
    Remote job

    ProSearch

    New York, NY
    2 days ago
  •  ...Job Title- Data Engineer Location- New York, NY...  ...advanced data science and AI/ML solutions in an agile...  ...with data engineering pipelines, while leveraging...  ...pipelines, and ensure scalable data infrastructure. Experience...  ...data science workflows. Collaborate with... 
    Pipeline

    campus4tech

    New York, NY
    2 days ago
  •  ...Data Foundations Engineer The Data Foundations Engineer designs and...  ...high-performance data pipelines and enabling analytics and ML use cases, with strong...  ...fundamentals in data modeling and scalable systems. Key...  ...pipelines. Develop ETL/ELT workflows optimized for... 
    Pipeline

    Apolis

    New York, NY
    4 days ago
  •  ...Data Engineer – Gen AI & RAG We are seeking an experienced Data...  ...be responsible for building scalable data pipelines, supporting AI-driven applications...  ...optimizing enterprise data workflows. Key Responsibilities:...  ...Collaborate with AI/ML teams to integrate RAG and... 
    Pipeline

    Group Nine LLC

    New York, NY
    18 hours ago
  •  ...Architect and own scalable, secure, cloud-native data platforms on Google...  ...and real-time data pipelines using BigQuery, Dataflow...  ...efficiency) Orchestrate workflows using Cloud Composer...  ...) Enable AI/ML and GenAI...  ...practices Mentor engineers, conduct design/code... 
    Pipeline

    Mphasis

    New York, NY
    18 hours ago
  •  ...Data Engineer - AI Data Platform The Role We're partnering...  ..., experimentation, and ML-driven insights....  ...reliable, production-grade data pipelines Design scalable batch and real-time ingestion...  ...features Support ML workflows (MLOps) with clean, structured... 
    Pipeline

    ORBIS

    New York, NY
    18 hours ago
  •  ...bring deep expertise in Data Science, Machine...  ...an experienced Data Engineer to join our data team...  ...building, and maintaining scalable data pipelines, data integration...  ...processing and transformation workflows using Databricks,...  ...and ingestion for AI/ML and Generative AI... 
    Pipeline
    Local area

    Tiger Analytics

    Jersey City, NJ
    8 days ago
  •  ...Data Engineer, Gen AI New York, New York, United States...  ...data infrastructure and pipelines necessary to enable...  ...Design and build scalable data pipelines to ingest...  ...and feature engineering workflows using Spark and Delta...  .... Collaborate with ML engineers and data scientists... 
    Pipeline

    Inizio Partners

    New York, NY
    1 day ago
  •  ...Job Title: Senior Data Engineer We're hiring a Data...  ...and manage robust data pipelines that ingest, process,...  ...optimize batch and real-time workflows. Ensure data...  ...business problems into scalable, actionable data solutions...  ...analytics layers, and ML-ready datasets. ~... 
    Pipeline
    Work at office

    Verse Medical

    New York, NY
    3 days ago
  •  ...looking for a Senior Data Engineer that is innovative, curious...  ...data-driven workflows and a broad range of media...  ...with an emphasis on scalable, open data architectures...  ..., reliable data pipelines that support sell-side...  ...requirements, support ML feature productionization... 
    Pipeline
    Work experience placement

    datafuelX Inc.

    New York, NY
    2 days ago
  • $170k - $250k

     ...platform for infinitely scalable clinical capacity. We...  ...for a highly driven Data Engineer to help design, build,...  ..., engineering, and AI/ML teams to ensure Thesis...  ...maintain scalable, data pipelines and architectures...  ...integrations and ETL workflows to reliably onboard and... 
    Pipeline
    Work at office
    3 days per week

    Thesis

    New York, NY
    3 days ago
  •  ...are looking for a Senior Data Engineer to help design, build, and...  ...transformation to reliability, scalability, and ML enablement. You will...  ...maintain scalable, reliable data pipelines and datasets that power...  ...Support machine learning workflows: Build and maintain feature... 
    Pipeline
    Work at office
    Relocation package

    Nelo Mobile

    New York, NY
    2 days ago
  •  ...AI Engineer Company description Zenith is one of...  ..., and future-proofed data capabilities. Overview...  ...and production-grade ML infrastructure. You...  ...can build reliable, scalable AI systems from...  ...applications, agentic workflows, multi-model pipelines, and advanced retrieval... 
    Pipeline

    Digitas

    New York, NY
    2 days ago
  •  ...Senior Staff Data Engineer The Senior Staff Data Engineer...  ...and delivering data pipelines that process billions...  ...modernization of legacy workflows into modular cloud-...  ...latency, reliability, and scalability in cloud-native...  ...Science to operationalize ML models for forecasting... 
    Pipeline

    Warner Bros.

    New York, NY
    1 day ago
  • $150k - $200k

     ...organization is looking for a Data Engineer to join the team and help...  ...and AI-enabled workflows. In this role, you will be...  ...building, and maintaining scalable data pipelines, data lake platforms, and analytics...  ...data for analytics and AI/ML applications, supporting multiple... 
    Pipeline
    Work at office
    Remote work

    Interactive Brokers

    New York, NY
    1 day ago
  • $180k - $260k

     ...updates on our news and engineering blogs and join us...  .... Role Data is crucial to Whatnot...  ..., ship resilient pipelines, and create the...  ...that balance cost, scalability, and consistency....  ...streaming and batch data workflows that process high-...  ...for analytics, ML, and real-time... 
    Pipeline
    Full time
    Work at office
    Local area
    Remote work
    Work from home
    Home office

    Whatnot

    New York, NY
    1 day ago
  • $90k - $120k

     ...at Position Title: Data Engineer Reports to: Manager...  ..., build, and optimize scalable data pipelines and lake house architectures...  ...orchestration using Databricks Workflows and CI/CD tools such as...  ...for business users • Apply ML techniques for data insights... 
    Pipeline
    Local area
    Remote work

    Public Partnerships

    New York, NY
    3 days ago
  • $176k - $238k

     ...Senior Data Engineer, Knowledge & Information United States...  ...large-scale data pipelines and foundational data...  ...products, and downstream AI/ML-enabled use cases....  ...pipeline performance, scalability, and system efficiency...  ...computationally intensive workflows. Partner with Data... 
    Pipeline
    For contractors
    Work experience placement
    Work at office
    Local area
    Remote work
    Flexible hours

    Komodo Health

    New York, NY
    18 hours ago
  • $230k

     ...Description Senior Data Engineer Location: Remote...  ...pivotal role in building scalable systems that turn...  ...maintain robust data pipelines that scale across high...  ...and dbt to manage data workflows and transformation...  ...with data science and ML teams to support model... 
    Pipeline
    Full time
    Remote work

    SW5 Consulting

    New York, NY
    18 days ago
  • Title: Sr. Data Engineer Location: NYC, NY Duration: long term Position...  ..., and maintain a hybrid ML infrastructure that...  ...ML engineers to streamline workflows and ensure scalable, reliable model deployment...  ...deployment targets. Optimize data pipelines and model deployment... 
    Pipeline
    Contract work

    Accord Technologies Inc

    New York, NY
    18 hours ago
  •  ...looking for a mid-to-senior Data Engineer to build the data...  .... You'll own end-to-end pipeline design and development, co-build ML model pipelines for propensity...  ...ll do Build robust and scalable data pipelines and...  ...development of orchestration workflows - from ingesting raw... 
    Pipeline

    FountAI, Inc.

    New York, NY
    18 hours ago
  • $135k - $145k

     ...SUMMARY The Data Engineer will be responsible...  ...involving Informatica data pipelines, API integrations,...  ...virtual assistants, and workflow automation. This...  ...and IT teams to deliver scalable, high-quality data and...  ...Experience with AI/ML frameworks, LLM-based... 
    Pipeline
    Work experience placement
    Summer work
    Work at office
    Remote work
    Flexible hours

    Empire State Realty Trust

    New York, NY
    1 day ago
  •  ...Senior Rust Full-Stack Engineer - AI Data & Infrastructure...  ...high-performance data pipelines, annotation tooling,...  ...pipelines and evaluation workflows Develop full-stack...  ..., and implement scalable fixes Participate...  ...Familiarity with AI/ML workflows, model training... 
    Pipeline
    Hourly pay
    Ongoing contract
    Contract work
    Freelance
    Remote work
    Flexible hours

    Alignerr

    New York, NY
    2 days ago
  •  ...background Build and Maintain Data Pipelines: Design, build, and maintain scalable, efficient, and reliable...  .... Implement Feature Engineering: Develop and manage feature...  ...for machine learning workflows, utilizing tools like Vertex AI, BigQuery ML, and custom Python libraries... 
    Pipeline
    Contract work
    Remote work

    LeadStack Inc.

    New York, NY
    2 days ago
  •  ...generations. Description The Data Engineer is responsible for...  ..., and optimizing scalable data platforms and AI-...  ...implementation, data pipeline automation, and advanced...  ...pipelines, models, and workflows using Snowflake...  ...developing or implementing AI/ML solutions or AI agents... 
    Pipeline

    Arhaus, LLC.

    Brooklyn, NY
    1 day ago
  • $175k - $250k

     ...Data Engineer - Trading Technology Infrastructure The Trading...  ...Design, develop, and maintain scalable real-time and batch data pipelines for reference data...  .... Familiarity with AI/ML techniques for data quality...  ...matching, anomaly detection, or workflow automation. Experience... 
    Pipeline

    Millennium Management Corp

    New York, NY
    3 days ago
  • $110k - $140k

     ...WEBSITE : TITLE: Data Engineer LOCATION: New York...  ...designing and building scalable, reliable, and...  ...Work closely with AI / ML Engineers to build intelligence...  ...data models, data pipelines, and data architectures...  ...consuming, and costly manual workflows related to alternative... 
    Pipeline
    Local area
    Remote work
    Work from home
    Home office
    Flexible hours

    Canoe Intelligence

    New York, NY
    more than 2 months ago
