Data Engineer for AI Benchmark Evaluation [Remote]

$50 per hour

SaidGig

Remote
  • Remote job

Role Overview

This role contributes to benchmark-driven evaluation projects focused on real-world data engineering and data science workflows. As a Software Engineer specializing in Data Engineering and Data Science, you will work hands-on with production-like datasets, data pipelines, and data science tasks to evaluate and improve the performance of advanced AI systems. The ideal candidate has a strong foundation in both data engineering and data science and can navigate data preparation, analysis, and model-related workflows within real-world codebases.

Key Responsibilities

  • Work with structured and unstructured datasets to support SWE-bench-style evaluation tasks.
  • Design, build, and validate data pipelines used in benchmarking and evaluation workflows.
  • Perform data processing, analysis, feature preparation, and validation for data science use cases.
  • Write, run, and modify Python code to process data and support experiments locally.
  • Evaluate data quality, transformations, and outputs for correctness and reproducibility.
  • Create clean, well-documented, and reusable data workflows suitable for benchmarking.
  • Participate in code reviews to ensure high standards of code quality and maintainability.
  • Collaborate with researchers and engineers to design challenging, real-world data engineering and data science tasks for AI systems.

Qualifications

  • At least 3 years of overall experience as a Data Engineer, Data Scientist, or data-focused Software Engineer.
  • Strong proficiency in Python for data engineering and data science workflows.
  • Demonstrable experience with data processing, analysis, and model-related workflows.
  • Solid understanding of machine learning and data science fundamentals.
  • Experience working with structured and unstructured data.
  • Ability to understand, navigate, and modify complex, real-world codebases.
  • Experience writing readable, reusable, maintainable, and well-documented code.
  • Strong problem-solving skills, including experience with algorithmic or data-intensive problems.
  • Excellent spoken and written English communication skills.

Work Terms

  • Commitment Required: At least 4 hours per day and at least 20 hours per week, including 4 hours of overlap with PST.
  • Engagement Type: Contractor assignment (no medical/paid leave).
  • Duration of Contract: 3 months (adjustable based on engagement).

Compensation

Compensation details will be discussed during the interview process.

Eligibility

  • This position is fully remote.
  • Opportunity to work on cutting-edge AI projects with leading LLM companies.
Vacancy posted 11 days ago
