Data Engineer

Fusemachines

About Fusemachines

Founded in 2013, Fusemachines is a global provider of enterprise AI products and services, on a mission to democratize AI. Leveraging proprietary AI Studio and AI Engines, the company helps drive the clients’ AI Enterprise Transformation, regardless of where they are in their Digital AI journeys. With offices in North America, Asia, and Latin America, Fusemachines provides a suite of enterprise AI offerings and specialty services that allow organizations of any size to implement and scale AI. Fusemachines serves companies in industries such as retail, manufacturing, and government.

Fusemachines continues to actively pursue the mission of democratizing AI for the masses by providing high-quality AI education in underserved communities and helping organizations achieve their full potential with AI.

 

Important: Immigration Sponsorship Policy

This position is not eligible for employment visa sponsorship or transfer sponsorship now or in the future.
  • Direct Company Sponsorship: Such as H-1B, J-1, or TN visas.
  • Employer of Record: Listing Fusemachines as the immigration employer on any government documentation.
  • Written Documentation: Providing letters or other support for any work authorization (e.g., OPT, STEM OPT, CPT).
 

About the role

This is a remote full-time consulting position responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization and Advanced Analytics).

We are looking for a skilled Senior Data Engineer with a strong background in Python, SQL, PySpark, Azure, Databricks, Synapse, Azure Data Lake, DevOps, and cloud-based large-scale data applications, and with a passion for data quality, performance, and cost optimization. The ideal candidate will develop in an Agile environment, contributing to the architecture, design, and implementation of data products, including migration from Synapse to Azure Data Lake. This role involves hands-on coding, mentoring junior staff, and collaborating with multidisciplinary teams to achieve project objectives.

Qualification & Experience

  • Must have a full-time Bachelor's degree in Computer Science or similar
  • At least 3 years of experience as a data engineer with strong expertise in Databricks, Azure, and DevOps, or with other hyperscalers.
  • 3+ years of experience with Azure DevOps, GitHub.
  • Proven experience as a data engineer delivering large-scale Data and Analytics projects and products, including migrations.
  • The following certifications:
    • Databricks Certified Associate Developer for Apache Spark
    • Databricks Certified Data Engineer Associate
    • Microsoft Certified: Azure Fundamentals
    • Microsoft Certified: Azure Data Engineer Associate
    • Microsoft Exam: Designing and Implementing Microsoft DevOps Solutions (nice to have)

Required skills/Competencies

  • Strong programming skills in one or more languages such as Python (must have) and Scala, with proficiency in writing efficient, optimized code for data integration, migration, storage, processing, and manipulation.
  • Strong understanding and experience with SQL and writing advanced SQL queries.
  • Thorough understanding of big data principles, techniques, and best practices.
  • Strong experience with scalable and distributed Data Processing Technologies such as Spark/PySpark (must have: experience with Azure Databricks), DBT and Kafka, to be able to handle large volumes of data.
  • Solid Databricks development experience with significant Python, PySpark, Spark SQL, Pandas, NumPy in Azure environment.
  • Strong experience in designing and implementing efficient ELT/ETL processes in Azure and Databricks and with open source solutions, with the ability to develop custom integration solutions as needed.
  • Skilled in Data Integration from different sources such as APIs, databases, flat files, event streaming.
  • Expertise in data cleansing, transformation, and validation.
  • Proficiency with relational databases (Oracle, SQL Server, MySQL, Postgres, or similar) and NoSQL databases (MongoDB or Table).
  • Good understanding of data modeling and database design principles, with the ability to design and implement efficient database schemas that meet the requirements of the data architecture to support data solutions.
  • Strong experience in designing and implementing data warehousing, data lake, and data lakehouse solutions in Azure and Databricks.
  • Good experience with Delta Lake, Unity Catalog, Delta Sharing, Delta Live Tables (DLT).
  • Strong understanding of the software development lifecycle (SDLC), especially Agile methodologies.
  • Strong knowledge of SDLC tools and technologies such as Azure DevOps and GitHub, including project management software (Jira, Azure Boards, or similar), source code management (GitHub, Azure Repos, or similar), CI/CD systems (GitHub Actions, Azure Pipelines, Jenkins, or similar), and binary repository managers (Azure Artifacts or similar).
  • Strong understanding of DevOps principles, including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC – Terraform, ARM including hands-on experience), configuration management, automated testing, performance tuning and cost management and optimization.
  • Strong knowledge of cloud computing, specifically Microsoft Azure services related to data and analytics, such as Azure Data Factory, Azure Databricks, Azure Synapse Analytics, Azure Data Lake, Azure Stream Analytics, SQL Server, Azure Blob Storage, Azure Data Lake Storage, Azure SQL Database, etc.
  • Experience in Orchestration using technologies like Databricks workflows and Apache Airflow.
  • Strong knowledge of data structures and algorithms and good software engineering practices.
  • Proven experience migrating from Azure Synapse to Azure Data Lake, or other technologies.
  • Strong analytical skills to identify and address technical issues, performance bottlenecks, and system failures.
  • Proficiency in debugging and troubleshooting issues in complex data and analytics environments and pipelines.
  • Good understanding of Data Quality and Governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent.
  • Experience with BI solutions, including Power BI, is a plus.
  • Strong written and verbal communication skills to collaborate and articulate complex situations concisely with cross-functional teams, including business users, data architects, DevOps engineers, data analysts, data scientists, developers, and operations teams.
  • Ability to document processes, procedures, and deployment configurations.
  • Understanding of security practices, including network security groups, Azure Active Directory, encryption, and compliance standards.
  • Ability to implement security controls and best practices within data and analytics solutions, including proficient knowledge of and working experience with various cloud security vulnerabilities and ways to mitigate them.
  • Self-motivated with the ability to work well in a team, and experienced in mentoring and coaching different members of the team.
  • A willingness to stay updated with the latest services, Data Engineering trends, and best practices in the field.
  • Comfortable with picking up new technologies independently and working in a rapidly changing environment with ambiguous requirements.
  • Care about architecture, observability, testing, and building reliable infrastructure and data pipelines.

Responsibilities

  • Architect, design, develop, test and maintain high-performance, large-scale, complex data architectures, which support data integration (batch and real-time, ETL and ELT patterns from heterogeneous data systems: APIs and platforms), storage (data lakes, warehouses, data lake houses, etc.), processing, orchestration and infrastructure. Ensure the scalability, reliability, and performance of data systems, focusing on Databricks and Azure.
  • Contribute to detailed design, architectural discussions, and customer requirements sessions.
  • Actively participate in the design, development, and testing of big data products.
  • Construct and fine-tune Apache Spark jobs and clusters within the Databricks platform.
  • Migrate out of Azure Synapse to Azure Data Lake or other technologies.
  • Assess best practices and design schemas that match business needs for delivering a modern analytics solution (descriptive, diagnostic, predictive, prescriptive).
  • Design and implement data models and schemas that support efficient data processing and analytics.
  • Design and develop clear, maintainable code with automated testing using Pytest, unittest, integration tests, performance tests, regression tests, etc.
  • Collaborate with cross-functional teams, including Product, Engineering, Data Scientists, and Analysts, to understand data requirements and develop data solutions, including reusable components meeting product deliverables.
  • Evaluate and implement new technologies and tools to improve data integration, data processing, storage and analysis.
  • Evaluate, design, implement and maintain data governance solutions: cataloging, lineage, data quality and data governance frameworks that are suitable for a modern analytics solution, considering industry-standard best practices and patterns.
  • Continuously monitor and fine-tune workloads and clusters to achieve optimal performance.
  • Provide guidance and mentorship to junior team members, sharing knowledge and best practices.
  • Maintain clear and comprehensive documentation of the solutions, configurations, and best practices implemented.
  • Promote and enforce best practices in data engineering, data governance, and data quality.
  • Ensure data quality and accuracy.
  • Design, implement and maintain data security and privacy measures.
  • Be an active member of an Agile team, participating in all ceremonies and continuous improvement activities, being able to work independently as well as collaboratively.

Fusemachines is an Equal Opportunities Employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local law.

Vacancy posted 2 hours ago