Data Engineer

Data Engineer Job Description Template

Our company is looking for a Data Engineer to join our team.

Responsibilities:

  • Participate in design discussions to improve our existing frameworks;
  • Gather requirements, assess gaps, and build roadmaps and architectures to help the analytics-driven organization achieve its goals;
  • Begin to develop a sphere of influence with other teams;
  • Handle multiple projects and meet deadlines;
  • Research and evaluate new methods and tools to improve data gathering processes;
  • Create and maintain data pipelines;
  • Create and maintain ETL processes to be used by reporting;
  • Recommend ways to improve data reliability, efficiency and quality;
  • Collaborate with ML engineers, data scientists, and Legal to design and implement compliant, secure, and robust feature stores;
  • Work with new data models that provide intuitive analytics;
  • Health: Medical, dental and vision;
  • Prior experience with HR data is a plus;
  • Migrate existing ETL logic to the Hadoop platform;
  • Analyze and track forecast accuracy to target areas for improvement;
  • Build end-to-end data integration and data warehousing solutions for analytics teams.

Requirements:

  • Bachelor’s Degree in Computer Science, Computer Engineering or a closely related field;
  • Ability to work in a team environment that promotes collaboration;
  • Experience munging poorly formatted or unstructured data;
  • 2+ years of experience implementing scalable data architectures;
  • Excellent communication and collaboration skills;
  • Strong Python programming skills and extensive knowledge of Python libraries and frameworks for building pipelines that cleanse and manipulate data;
  • Experience in building machine learning models;
  • Fluency in Python and experience containerizing code for deployment;
  • Understanding of the importance of picking the right data store for the job (columnar, logging, OLAP, OLTP, etc.);
  • 3+ years of experience with SQL and relational databases;
  • Ability to work with high-volume, heterogeneous data, preferably with distributed systems such as Hadoop;
  • Experience operating a workflow manager such as Airflow;
  • Knowledge of machine learning concepts, applications, and libraries, particularly recommender systems, NLP, classification, and clustering techniques;
  • Solid analytical skills and demonstrated problem-solving ability;
  • Experience with Agile implementation methodologies.