Mid-Level Data Engineer

Vantage Data Centers

About Vantage Data Centers

Vantage Data Centers powers, cools, protects and connects the technology of the world’s well‑known hyperscalers, cloud providers and large enterprises. Developing and operating across North America, EMEA and Asia Pacific, Vantage has evolved data center design in innovative ways to deliver dramatic gains in reliability, efficiency and sustainability in flexible environments that can scale as quickly as the market demands.

Technology and Systems

The Technology & Systems department drives technological innovation for the company and advances the technology strategy to support global growth. This includes IT, Software Development, OT / Automation Systems, and business process improvement. At Vantage, we are very hands-on: we specify, purchase, configure and maintain all networking and server hardware, and we work closely with partner VARs to learn about the latest technology changes. The department participates in designing each new data center building’s networking infrastructure.

Position Overview

This position will be based on site in Denver, CO, in alignment with our flexible work policy (3 days on site required, 2 days flexible). Vantage Data Centers is seeking a Mid-Level Data Engineer to build, operate and scale our enterprise data platform. The role is designed for an engineer who can operate independently, execute reliably in a fast‑paced environment, and take ownership of data pipelines and datasets with minimal ramp‑up.

Essential Job Functions

Design, build, and maintain reliable, scalable data pipelines using Python and PySpark on the Microsoft Azure data platform.
Develop and operate batch and incremental data pipelines leveraging Azure Data Factory for orchestration and Azure Data Lake Storage Gen2 as the primary data store.
Implement SQL‑ and Spark‑based transformations to produce curated datasets that support enterprise reporting, analytics, and downstream consumption.
Take ownership of assigned data pipelines and datasets, including monitoring, troubleshooting, and performance optimization in production environments.
Work with Azure Synapse (dedicated or serverless where applicable) to support analytical workloads and data consumption patterns.
Collaborate with business analysts and cross‑functional stakeholders to translate data requirements into practical, working data solutions.
Prepare and structure data to support advanced analytics and AI‑enabled use cases by ensuring data quality, consistency, and documentation.
Apply established data governance, security, and engineering standards to ensure compliant, maintainable, and scalable solutions.
Participate in code reviews, technical discussions, and platform improvement initiatives as an active contributor.
Proactively identify data quality issues, pipeline risks, and improvement opportunities, and communicate them clearly in a fast‑paced environment.

Duties

Develop and maintain PySpark notebooks and jobs to ingest, transform, and curate data within the enterprise data platform.
Build and modify Azure Data Factory pipelines for batch and incremental data ingestion.
Implement Spark‑based transformations that write curated datasets to Azure Data Lake Storage Gen2 using established folder structures and naming conventions.
Create and maintain SQL views and tables in Azure Synapse to support analytics and reporting use cases.
Respond to pipeline failures, data validation issues, and operational alerts.
Perform basic performance tuning of Spark jobs (e.g., partitioning, filtering, incremental logic) within established architectural patterns and standards.
Validate data outputs with business partners and address data defects or discrepancies.
Commit code using Git, follow branching standards, and participate in pull request reviews.
Update documentation for pipelines, datasets, and operational runbooks as changes are made.
Execute assigned backlog items within sprint timelines and raise risks or blockers early.
Additional duties as assigned by management.

Job Requirements

Education & Experience

Bachelor’s degree in Engineering, Computer Science, Data Analytics, or a related field, or equivalent experience.
3‑5 years of experience in data engineering or analytics engineering.
Proficiency in Python for building and maintaining data pipelines, automation, and data processing workflows, including use of PySpark.
Proficiency in SQL for querying, transformation, and analytical data processing.
Solid understanding of ETL/ELT pipelines, data transformation patterns, and data integration concepts.
Experience analyzing enterprise data sources to identify data relationships, transformations, and business rules.
Experience building solutions on the Microsoft Azure platform with exposure to services such as Azure Data Factory, Azure Synapse, Azure Data Lake Storage Gen2, and related analytics services.
Experience working with source control and CI/CD workflows using tools such as GitHub or Azure DevOps.
Working knowledge of data modeling fundamentals, including fact and dimension tables.
Strong communication and interpersonal skills with the ability to collaborate across teams in a fast‑paced environment.
Experience working in Agile development environments.
Experience using collaboration and project tracking tools such as Jira or similar tools.
Travel is expected to be up to 10% but may increase over time as the business evolves.

Desired Qualifications

Experience working with distributed data processing frameworks, including Apache Spark.
Exposure to advanced analytics or AI‑adjacent data use cases, including preparing data for machine learning or intelligent applications.
Familiarity with additional Azure services such as Azure Functions or Logic Apps in support of data workflows.
Experience supporting data platform enhancement, refactoring, or modernization initiatives.
Familiarity with data quality, reliability, and operational best practices in production environments.
Experience working in a scaling or fast‑paced organization where priorities evolve quickly.

Physical Demands and Special Requirements

The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions. The employee may occasionally stand, walk, sit, use hands to handle or feel objects, reach with hands and arms, climb stairs, balance, stoop or kneel, talk and hear, and lift or move up to 25 pounds.

Additional Details

Salary Range: $100,000‑$115,000 + Bonus (based on Colorado market data; may vary in other locations). This position is eligible for company benefits including medical, dental, and vision coverage; life and AD&D; short‑ and long‑term disability coverage; paid time off; employee assistance; participation in a 401(k) program with company match; and many other voluntary benefits. Compensation for the role will depend on a number of factors, including your qualifications, skills, competencies, and experience, and may fall outside of the displayed range.

Vantage Data Centers is an Equal Opportunity Employer.

Vacancy posted more than 2 months ago
