AI Agent Evaluation Analyst (Freelance)
$60 per hour
Mindrift
Location & Eligibility
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

About Mindrift
At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What We Do
The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

Who We’re Looking For
We’re looking for curious and intellectually proactive contributors: people who double‑check assumptions and play devil’s advocate. If you thrive in ambiguity, enjoy remote asynchronous work, and want to learn how modern AI systems are tested and evaluated, we want to hear from you.

Project Overview
We are seeking QA experts for autonomous AI agents in a project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you will balance quality assurance, research, and logical problem‑solving.

Responsibilities
- Review evaluation tasks and scenarios for logic, completeness, and realism.
- Identify inconsistencies, missing assumptions, or unclear decision points.
- Define clear expected behaviours (gold standards) for AI agents.
- Annotate cause‑effect relationships, reasoning paths, and plausible alternatives.
- Think through complex systems and policies as a human would to ensure agents are tested properly.
- Collaborate with QA, writers, or developers to suggest refinements or edge‑case coverage.

Requirements
- Excellent analytical thinking: ability to reason about complex systems, scenarios, and logical implications.
- Strong attention to detail: spot contradictions, ambiguities, and vague requirements.
- Familiarity with structured data formats: read (not necessarily write) JSON/YAML.
- Ability to assess scenarios holistically: identify what’s missing, unrealistic, or potentially breaking.
- Good communication and clear writing (in English) to document findings.

We also value applicants who have:
- Experience with policy evaluation, logic puzzles, case studies, or structured scenario design.
- Background in consulting, academia, olympiads (e.g. logic/math/informatics), or research.
- Exposure to LLMs, prompt engineering, or AI‑generated content.
- Familiarity with QA or test‑case thinking (edge cases, failure modes, "what could go wrong").
- Some understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.).

Benefits
- Competitive pay up to $60/hour depending on skills, experience, and project needs.
- Flexible, remote, freelance project that fits around your primary professional or academic commitments.
- Advanced AI project experience to enhance your portfolio.
- Opportunity to influence how future AI models understand and communicate in your field of expertise.

