AI Data Infrastructure Engineer

Bright Vision Technologies

Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. We leverage cutting-edge technologies to create scalable, secure, and user-friendly applications.

As we continue to grow, we're looking for a skilled AI Data Infrastructure Engineer to join our dynamic team and contribute to our mission of transforming business processes through technology.

This is a fantastic opportunity to join an established and well-respected organization offering tremendous career growth potential.

Job Title: AI Data Infrastructure Engineer
Location: 100% Remote (Continental United States)
Position Type: In-house Bright Vision Technologies SOW engagement (no third-party client or vendor)
Experience: 6+ years
Salary: 100K - 150K
Sponsorship: No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.
Employment Type: Full-time, direct W2 with Bright Vision Technologies (no C2C, no 1099, no third-party)
Engagement: Long-term, multi-year, aligned to the Bright Vision SOW delivery roadmap
Compensation: Competitive base salary commensurate with experience, plus benefits.

Employment Terms & Visa Policy
This is a 100% remote, full-time, direct W2 position with Bright Vision Technologies.
This role is part of Bright Vision Technologies' in-house Statement of Work (SOW) engagement. The client, end customer, and employer for this position is Bright Vision Technologies - there is no third-party client, vendor, or implementation partner involved.
We do not engage in C2C, 1099, or third-party arrangements for this role. All our roles are W2; no third-party companies or brokering, please.
Candidates must be willing to work directly as a full-time W2 employee of Bright Vision Technologies and contribute to our in-house SOW deliverables.
No new H1B sponsorship is available for this role.
However, candidates who are currently on a valid H1B visa and require a transfer are welcome to apply. We will support H1B transfers for qualified candidates.
For every role, a technical coding assessment is mandatory. Please apply only if you are confident in your technical abilities and hands-on experience.

Job Summary
We are seeking an AI Data Infrastructure Engineer to build and operate the large-scale data systems that power modern AI training and evaluation pipelines. The role combines deep data engineering expertise with a strong understanding of AI workloads, focusing on ingestion, transformation, quality assurance, lineage, and high-throughput delivery of data to training jobs across diverse modalities. The ideal candidate has experience operating petabyte-scale data systems, strong software engineering fundamentals, and a clear understanding of how data infrastructure choices propagate into model quality and training efficiency.

Key Responsibilities
  • Design and operate large-scale data pipelines supporting AI training, evaluation, and continual improvement workflows.
  • Build ingestion systems for diverse modalities including text, image, audio, video, and structured signals.
  • Implement data cleaning, deduplication, filtering, and quality assurance at petabyte scale.
  • Develop dataset versioning, lineage, and provenance tracking systems suitable for reproducible training.
  • Build high-throughput data loading systems that maximize GPU utilization during training.
  • Implement labeling workflows, active learning pipelines, and human-in-the-loop data improvement systems.
  • Design storage architectures balancing cost, throughput, and latency across data tiers.
  • Build evaluation dataset construction pipelines with strict integrity and contamination controls.
  • Implement data privacy, redaction, and consent enforcement throughout the pipeline.
  • Collaborate with ML researchers and engineers to align data systems with model development needs.
  • Drive observability of data quality, drift, and pipeline health across the AI data estate.
  • Optimize cost and performance through compression, format selection, and caching strategies.
  • Document data systems, schemas, and operational procedures for broad internal use.
  • Stay current with AI data infrastructure research and emerging open-source tools.

Required Qualifications
  • Bachelor's or Master's degree in Computer Science or a related field.
  • Six or more years of data engineering experience, with significant work supporting ML or AI workloads.
  • Strong proficiency in Python and at least one JVM or systems language.
  • Deep experience with modern data processing frameworks such as Spark, Ray, or Beam.
  • Hands-on experience operating petabyte-scale storage and pipeline systems.
  • Strong understanding of distributed systems, data modeling, and storage formats.
  • Experience with dataset versioning, lineage, and reproducibility for ML workflows.
  • Familiarity with high-throughput data loading for accelerator-based training.
  • Strong software engineering practices including testing, CI/CD, and code review.
  • Excellent communication and cross-functional collaboration skills.

Preferred Qualifications
  • Experience with multimodal datasets at large scale.
  • Familiarity with data quality tooling and dataset evaluation methodology.
  • Exposure to privacy-preserving data systems and regulated data handling.
  • Open-source contributions to data infrastructure projects.
  • Experience supporting frontier model training pipelines.

How to Apply
Would you like to know more about this opportunity?
For immediate consideration, please send your resume to [email protected]
Learn more about Bright Vision Technologies at
We recognize that our people are our strength, and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company.
We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as mental health or physical disability needs.
Bright Vision Technologies is an Equal Opportunity Employer, including Disability/Veterans.
Position offered by "No Fee Agency."

Equal Employment Opportunity (EEO) Statement

Bright Vision Technologies (BV Teck) is committed to equal employment opportunity (EEO) for all employees and applicants without regard to race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, veteran status, or any other protected status as defined by applicable federal, state, or local laws. This commitment extends to all aspects of employment, including recruitment, hiring, training, compensation, promotion, transfer, leaves of absence, termination, layoffs, and recall.

BV Teck expressly prohibits any form of workplace harassment or discrimination. Any improper interference with employees' ability to perform their job duties may result in disciplinary action up to and including termination of employment.
Vacancy posted 3 days ago