AI Agent Evaluation Analyst (Freelance)

$60 per hour

Mindrift

Location & Eligibility
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

About Mindrift
At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What We Do
The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.

Who We're Looking For
We're looking for curious and intellectually proactive contributors: people who double-check assumptions and play devil's advocate. If you thrive in ambiguity, enjoy remote asynchronous work, and want to learn how modern AI systems are tested and evaluated, we want to hear from you.

Project Overview
We are seeking QA experts for autonomous AI agents in a project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you will balance quality assurance, research, and logical problem-solving.

Responsibilities
  • Review evaluation tasks and scenarios for logic, completeness, and realism.
  • Identify inconsistencies, missing assumptions, or unclear decision points.
  • Define clear expected behaviours (gold standards) for AI agents.
  • Annotate cause-effect relationships, reasoning paths, and plausible alternatives.
  • Think through complex systems and policies as a human would to ensure agents are tested properly.
  • Collaborate with QA, writers, or developers to suggest refinements or edge-case coverage.

Requirements
  • Excellent analytical thinking: ability to reason about complex systems, scenarios, and logical implications.
  • Strong attention to detail: spot contradictions, ambiguities, and vague requirements.
  • Familiarity with structured data formats: read (not necessarily write) JSON/YAML.
  • Ability to assess scenarios holistically: identify what's missing, unrealistic, or potentially breaking.
  • Good communication and clear writing (in English) to document findings.

We also value applicants who have:
  • Experience with policy evaluation, logic puzzles, case studies, or structured scenario design.
  • Background in consulting, academia, olympiads (e.g. logic/math/informatics), or research.
  • Exposure to LLMs, prompt engineering, or AI-generated content.
  • Familiarity with QA or test-case thinking (edge cases, failure modes, "what could go wrong").
  • Some understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.).

Benefits
  • Competitive pay up to $60/hour depending on skills, experience, and project needs.
  • Flexible, remote, freelance project that fits around your primary professional or academic commitments.
  • Advanced AI project experience to enhance your portfolio.
  • Opportunity to influence how future AI models understand and communicate in your field of expertise.

Vacancy posted 10 hours ago
Similar jobs that could be interesting for you
Based on the AI Agent Evaluation Analyst (Freelance) in Austin, TX vacancy
  • $60 per hour

     ...firm is seeking legal consultants with US law experience for part-time, project-based opportunities. You will generate prompts for AI, evaluate solutions, and improve reasoning standards. Ideal candidates have a law degree and 2+ years of legal experience. Strong written... 
    Freelance
    Part time

    Mindrift

    Austin, TX
    1 day ago
  • Join to apply for the Online Data Analyst Odia role at TELUS Digital AI Data Solutions Are you a detail-oriented...  ...national and local geography? This freelance opportunity allows you to work at...  ...worldwide Completing research and evaluation tasks in a web-based environment... 
    Freelance
    Part time
    Local area
    Worldwide

    TELUS Digital AI Data Solutions

    Austin, TX
    1 day ago
  •  ...parties. Join Our Team Agentic AI Engineering Intern Engineering & Innovation Denver, CO Analyst/Sr. Analyst, Capital Markets...  ...Engineer, Enterprise Systems & Agent Integrations Operational...  ...Staff, Agent Workflow Systems and Evaluation Operational Excellence California... 
    Suggested
    Internship
    Remote work
    Night shift

    SB Energy

    Austin, TX
    4 days ago
  • Central Health is seeking a Compensation Analyst - Core Compensation in Austin, Texas. This role focuses on supporting compensation programs through market analysis and job evaluation, ensuring competitive and equitable pay practices across the organization. The ideal... 
    Suggested

    Central Health

    Austin, TX
    10 hours ago
  • $73 per hour

    A leading AI consultancy is seeking a Quantitative Statistics Expert to work flexibly as a freelance AI Trainer. This remote role requires a Bachelor's degree in Statistics and...  ...include generating AI prompts and evaluating model accuracy, allowing you to impact the... 
    Freelance
    Remote job

    Mindrift

    Austin, TX
    2 days ago
  • $55 per hour

    A leading AI firm is seeking a Freelance Biology Expert with proficiency in Python to contribute to advanced AI projects from the comfort of your home. The role involves generating prompts, evaluating AI responses, and leveraging your expertise in Biology. Candidates should... 
    Freelance
    Remote job

    Mindrift

    Austin, TX
    1 day ago
  • $55 per hour

     ...forward-thinking tech company is seeking an Electrical Engineer with Python experience to work as a freelance AI Trainer. The role involves generating prompts, evaluating AI models, and correcting responses based on your expertise. This part-time, remote position offers... 
    Freelance
    Part time
    Remote work

    Mindrift

    Austin, TX
    3 days ago
  • $35 - $65 per hour

    Invisible Agency is seeking a Pure Mathematics Specialist for a Freelance AI Trainer Project. This remote position requires a theoretical mathematics expert to construct and evaluate complex proofs, ensuring rigor and correctness. Ideal candidates are fluent in Lean 4... 
    Freelance
    Remote job
    Hourly pay

    Invisible Agency

    Austin, TX
    4 days ago
  • $8 - $30 per hour

    Invisible Agency is seeking a LaTeX Specialist for a freelance AI Trainer Project to audit and refine AI-generated text. The ideal candidate...  ...LaTeX expressions, rewriting text, and applying consistent evaluation rubrics. The pay ranges from $8 to $30 per hour, depending on... 
    Freelance
    Remote job
    Hourly pay

    Invisible Agency

    Austin, TX
    3 days ago
  • A global leader in customer experience is seeking a Freelance Luxury Brand Evaluator in Austin, TX. In this role, you will assess customer experiences with high-end brands by visiting stores or evaluating online. Enjoy flexible assignments and compensation based on your... 
    Freelance
    Flexible hours

    CXG

    Austin, TX
    10 hours ago
  •  ...& Acquisitions Attorney to support the development of AI tools. This senior-level freelance role requires at least 4 years of hands-on experience in...  ...corporate law. The attorney will review AI-generated content, evaluate financial and legal reasoning, and work closely with a... 
    Freelance
    Remote job
    Hourly pay
    For contractors

    Invisible Agency

    Austin, TX
    3 days ago
  • $8 - $65 per hour

    Invisible Agency is seeking a remote Hebrew Translator for an AI Trainer project. In this freelance role, you will evaluate and refine AI-generated translations and ensure accuracy, fluency, and cultural context. Ideal candidates will have a background in translation or... 
    Freelance
    Remote job
    Hourly pay

    Invisible Agency

    Austin, TX
    10 hours ago
  • Invisible Agency is looking for a PHP Coding Specialist for a freelance AI Trainer project. In this role, you will apply your PHP expertise to shape the future of AI by evaluating code quality, conversing with AI on engineering tasks, and suggesting improvements. Ideal... 
    Freelance
    Remote job
    Hourly pay
    Contract work

    Invisible Agency

    Austin, TX
    4 days ago
  • $11 - $30.65 per hour

    Invisible Agency is looking for an Audio Specialist as a Freelance AI Trainer to evaluate advanced audio models. In this remote role, you will create scenarios that simulate real customer service interactions and assess model performance based on various criteria. The... 
    Freelance
    Remote job
    Hourly pay

    Invisible Agency

    Austin, TX
    10 hours ago
  • $60 per hour

     ...and indicate your level of English proficiency. Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project-based, not permanent employment. What This... 
    Freelance
    Permanent employment
    Temporary work
    Part time
    10 hours per week

    Mindrift

    Austin, TX
    2 days ago
  • $55 per hour

     ...and indicate your level of English proficiency. Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project-based, not permanent employment. What This Opportunity... 
    Freelance
    Permanent employment
    Temporary work
    Part time
    10 hours per week

    Mindrift

    Austin, TX
    2 days ago
  • Drone Systems Specialist - Freelance AI Trainer Project United States of America We are seeking a highly skilled specialist with deep...  ...across large datasets. Typical day responsibilities include evaluating diverse drone images and applying strict annotation standards... 
    Freelance
    Hourly pay
    Contract work
    For contractors
    Remote work

    Invisible Agency

    Austin, TX
    10 hours ago
  • $26 - $28 per hour

     ...Data Labeling Analyst Welo Data is looking for detail-oriented and reliable individuals...  ...Labeling Analysts, supporting speech and voice AI systems. This is a high-impact...  ...this role is more execution-focused than evaluation-heavy roles, it still requires strong judgment... 
    Full time
    Work experience placement
    Remote work
    Visa sponsorship

    Welocalize

    Austin, TX
    5 days ago
  • $8 - $65 per hour

    STEM Specialist - AI Trainer (Contract, Remote) Location: United States of America (remote...  ...improvements to prompt engineering and evaluation metrics. You will also document failure...  ..., and aiding engineers, scientists, and analysts. Responsibilities Converse with the model... 
    Freelance
    Hourly pay
    Contract work
    Remote work

    Invisible Agency

    Austin, TX
    2 days ago
  • $26 - $28 per hour

     ...Data Labeling Analyst Welo Data is looking for detail-oriented and reliable individuals...  ...Labeling Analysts, supporting speech and voice AI systems. This is a high-impact...  ...this role is more execution-focused than evaluation-heavy roles, it still requires strong judgment... 
    Full time
    Work experience placement
    Remote work
    Visa sponsorship

    Welocalize

    Austin, TX
    3 days ago
  • $8 - $65 per hour

    Mongolian Language Specialist - AI Trainer United States of America Are you an experienced Mongolian language professional eager to...  ...to training data. You’ll work with cutting‑edge AI tools, evaluate and refine Mongolian text outputs, and provide expert feedback on... 
    Freelance
    Hourly pay
    Contract work
    For contractors
    Remote work

    Invisible Agency

    Austin, TX
    10 hours ago
  • $23 per hour

     ...curious people from around the world with freelance online tasks that train and improve...  ...Annotators connects individuals with Generative AI projects from leading tech innovators....  ...projects such as rating AI-generated content, evaluating factual accuracy, or comparing responses... 
    Freelance
    Part time
    Remote work

    Toloka Annotators

    Austin, TX
    6 days ago
  • $45 per hour

     ...intelligence to ethically shape the future of AI. What We Do The Mindrift platform...  ...Define comprehensive scoring criteria to evaluate the accuracy of the AI's answers. Correct...  ...with challenging, complex guidelines. Our freelance role is fully remote, so you just need a... 
    Freelance
    Part time
    Remote work

    Mindrift

    Austin, TX
    1 day ago
  • $8 - $30 per hour

    LaTeX Specialist - Freelance AI Trainer Project World Wide - Remote Project Overview We are sourcing independent LaTeX Specialists to provide their expertise for an AI benchmark evaluation project. As AI models increasingly generate complex long-form content that blends... 
    Freelance
    Hourly pay
    For contractors
    Remote work

    Invisible Agency

    Austin, TX
    4 days ago
  • $55 per hour

    Electrical Engineer with Python Experience - Freelance AI Trainer 1 week ago Be among the first 25 applicants This opportunity is only for...  ...that challenge AI. Define comprehensive scoring criteria to evaluate the accuracy of the AI's answers. Correct the model's... 
    Freelance
    Part time
    Remote work

    Mindrift

    Austin, TX
    10 hours ago
  • $110k - $160k

    Hellopatient is seeking a technical AI Agent Product Manager in Austin, Texas, to lead the delivery of AI agents tailored for healthcare...  ...and continuously enhance agent performance through structured evaluation and real-world feedback. The position offers a competitive... 

    Hellopatient

    Austin, TX
    1 day ago
  • $8 - $65 per hour

    Philosophy Specialist - Freelance AI Trainer Project United States of America Are you a philosophy expert eager to shape the future of...  ...traces, and suggest improvements to our prompt engineering and evaluation metrics. A master’s or PhD in philosophy or a closely... 
    Freelance
    Hourly pay
    Contract work
    For contractors
    Immediate start
    Remote work

    Invisible Agency

    Austin, TX
    4 days ago
  • $35 - $65 per hour

    Pure Mathematics Specialist - Freelance AI Trainer Project World Wide - Remote Are you a theoretical mathematics expert eager to shape the...  .... Responsibilities On a typical day, you will construct and evaluate complex proofs, substantiate the mathematical reasoning for... 
    Freelance
    Hourly pay
    Contract work
    For contractors
    Remote work

    Invisible Agency

    Austin, TX
    4 days ago
  • $8 - $65 per hour

     ...experienced Tajik language professional eager to shape the future of AI? Large‑scale language models are evolving rapidly, moving beyond...  ...to training data. You’ll work with cutting‑edge AI tools, evaluate and refine Tajik text outputs, and provide expert feedback on grammar... 
    Freelance
    Hourly pay
    Contract work
    For contractors
    Remote work
    Worldwide

    Invisible Agency

    Austin, TX
    3 days ago
  • German Voice Actor - Freelance AI Trainer Project United States of America Are you an experienced German voice actor eager to shape the...  ...training data. You’ll work with cutting‑edge AI tools, record and evaluate German speech samples, and provide expert feedback on... 
    Freelance
    Hourly pay
    Contract work
    For contractors
    Remote work

    Invisible Agency

    Austin, TX
    4 days ago
