AI Agent Evaluation Analyst

$55 per hour

Mindrift

This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.

We’re looking for curious and intellectually proactive contributors, the kind of person who double-checks assumptions and plays devil’s advocate. Are you comfortable with ambiguity and complexity? Does an async, remote, flexible opportunity sound exciting? Would you like to learn how modern AI systems are tested and evaluated?

This is a flexible, project-based opportunity well-suited for:
  • Analysts, researchers, or consultants with strong critical thinking skills
  • Students (senior undergrads/grad students) looking for an intellectually interesting gig
  • People open to a part-time and non-permanent opportunity

About the project

We’re on the hunt for QAs for autonomous AI agents for a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you’ll have to balance quality assurance, research, and logical problem-solving. This project opportunity is ideal for people who enjoy looking at systems holistically and thinking through scenarios, implications, and edge cases.

You do not need a coding background, but you must be curious, intellectually rigorous, and capable of evaluating the soundness and consistency of complex setups. If you’ve ever excelled in things like consulting, CHGK, Olympiads, case solving, or systems thinking, you might be a great fit.

What you’ll be doing
  • Reviewing evaluation tasks and scenarios for logic, completeness, and realism
  • Identifying inconsistencies, missing assumptions, or unclear decision points
  • Helping define clear expected behaviors (gold standard) for AI agents
  • Annotating cause-effect relationships, reasoning paths, and plausible alternatives
  • Thinking through complex systems and policies as a human would to ensure agents are tested properly
  • Working closely with QA, writers, or developers to suggest refinements or edge case coverage

How to get started

Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.

Requirements
  • Excellent analytical thinking: Can reason about complex systems, scenarios, and logical implications
  • Strong attention to detail: Can spot contradictions, ambiguities, and vague requirements
  • Familiarity with structured data formats: Can read, not necessarily write, JSON/YAML
  • Ability to assess scenarios holistically: What’s missing, what’s unrealistic, what might break?
  • Good communication and clear writing (in English) to document your findings

We also value applicants who have:
  • Experience with policy evaluation, logic puzzles, case studies, or structured scenario design
  • Background in consulting, academia, olympiads (e.g. logic/math/informatics), or research
  • Exposure to LLMs, prompt engineering, or AI-generated content
  • Familiarity with QA or test-case thinking (edge cases, failure modes, “what could go wrong”)
  • Some understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.)

Benefits
  • Get paid for your expertise, with rates that can go up to $55/hour depending on your skills, experience, and project needs
  • Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments
  • Participate in an advanced AI project and gain valuable experience to enhance your portfolio
  • Influence how future AI models understand and communicate in your field of expertise

Vacancy posted 5 days ago