AI Agent Evaluation Analyst
$55 per hour
Mindrift
Overview

This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What We Do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.

Who we're looking for

We're looking for curious and intellectually proactive contributors, the kind of person who double-checks assumptions and plays devil's advocate. Are you comfortable with ambiguity and complexity? Does an async, remote, flexible opportunity sound exciting? Would you like to learn how modern AI systems are tested and evaluated?

This is a flexible, project-based opportunity well-suited for:
- Analysts, researchers, or consultants with strong critical thinking skills
- Students (senior undergrads / grad students) looking for an intellectually interesting gig
- People open to a part-time and non-permanent opportunity

About the project

We're on the hunt for QAs for autonomous AI agents for a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you'll have to balance quality assurance, research, and logical problem-solving. This project opportunity is ideal for people who enjoy looking at systems holistically and thinking through scenarios, implications, and edge cases.

You do not need a coding background, but you must be curious, intellectually rigorous, and capable of evaluating the soundness and consistency of complex setups. If you've ever excelled in things like consulting, CHGK, Olympiads, case solving, or systems thinking, you might be a great fit.

What you'll be doing
- Reviewing evaluation tasks and scenarios for logic, completeness, and realism
- Identifying inconsistencies, missing assumptions, or unclear decision points
- Helping define clear expected behaviors (gold standards) for AI agents
- Annotating cause-effect relationships, reasoning paths, and plausible alternatives
- Thinking through complex systems and policies as a human would to ensure agents are tested properly
- Working closely with QA, writers, or developers to suggest refinements or edge case coverage

How to get started

Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.

Requirements
- Excellent analytical thinking: can reason about complex systems, scenarios, and logical implications
- Strong attention to detail: can spot contradictions, ambiguities, and vague requirements
- Familiarity with structured data formats: can read, though not necessarily write, JSON/YAML
- Ability to assess scenarios holistically: what's missing, what's unrealistic, what might break?
- Good communication and clear writing (in English) to document your findings

We also value applicants who have
- Experience with policy evaluation, logic puzzles, case studies, or structured scenario design
- Background in consulting, academia, olympiads (e.g. logic/math/informatics), or research
- Exposure to LLMs, prompt engineering, or AI-generated content
- Familiarity with QA or test-case thinking (edge cases, failure modes, "what could go wrong")
- Some understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.)

Benefits
- Get paid for your expertise, with rates that can go up to $55/hour depending on your skills, experience, and project needs
- Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments
- Participate in an advanced AI project and gain valuable experience to enhance your portfolio
- Influence how future AI models understand and communicate in your field of expertise

$60 per hour
...ethically shape the future of AI. What We Do The Mindrift platform... ...AI systems are tested and evaluated? This is a flexible, project-... ...opportunity well-suited for: Analysts, researchers, or consultants with... ...for QAs for autonomous AI agents for a new project focused on validating...
Suggested, Permanent employment, Part time, Freelance, Remote work, Flexible hours

$55 per hour
A leading AI innovation firm is seeking QAs for autonomous AI agents to improve evaluation frameworks. Candidates should possess excellent analytical thinking and attention to detail to review tasks and define clear standards. This remote, flexible opportunity offers rates...
Suggested, Remote job, Part time, Flexible hours

$60 per hour
...innovation company is seeking QAs for autonomous AI agents to validate and improve task structures and evaluate logic. The role requires excellent analytical thinking... ...rates up to $60/hour. This position is ideal for analysts or students looking to contribute meaningfully to...
Suggested, Remote job, Flexible hours

$80 per hour
...-time opportunity focused on quality assurance for autonomous AI agents. You will analyze complex systems, review tasks for logic, and... ...analytical and detail-oriented skills, with experience in policy evaluation or logic puzzles preferred. Compensation can reach up to $80/...
Suggested, Remote job, Part time, Flexible hours

$80 per hour
This opportunity is... ...Protocol (MCP) servers and internal tools for running and evaluating agent behavior. You'll implement base methods for agent action verification...
Suggested, Part time, Freelance, Remote work, Flexible hours

...AI Solutions Analyst Location: NC-RTP, US Contract Type: Regular Full-Time Would you like to join an international team working to improve... ...detection, and scoring workflows. Participate in model evaluation, validation, monitoring, and lifecycle maintenance. Prepare...
Full time, Contract work, Work at office

$95k - $105k
...monitoring efforts of the organization’s Artificial Intelligence (AI) governance and compliance program. Works closely with... ...regulatory examinations related to AI governance and controls. Evaluate documentation and artifacts for completeness and accuracy (e.g.,...
Work experience placement, Remote work, Work from home

$40 per hour
A technology company is seeking experienced cybersecurity professionals to join their REMOTE team. The role involves evaluating AI-generated security content and solving technical cybersecurity problems. Candidates should have 2+ years in cybersecurity with some coding...
Remote job, Hourly pay, Flexible hours

$50 - $60 per hour
A legal consulting firm in North Carolina is seeking a Legal Specialist to evaluate AI models by providing complex legal problems. Candidates must hold a law degree and have 5+ years of experience in various legal fields. This role allows flexibility in project selection...
Hourly pay, For contractors, Flexible hours

...RWD Data Analyst Company: Norstella Location: Remote, United States Date Posted:... ...patient access. Each organization (Citeline, Evaluate, MMIT, Panalgo, and The Dedham Group)... ...solutions prior to escalation Leverage AI-powered tools (e.g., ChatGPT, Claude) to...
Full time, Temporary work, Local area, Remote work, Flexible hours

$67.12k - $114.47k
The State of North Carolina - Health & Human Services is seeking a Program Analyst II to oversee early intervention programs. The role requires advanced analytical skills, overseeing program compliance and leading collaborative efforts to improve service delivery. Ideal...

$95k - $120k
...Use structured problem-solving to resolve business challenges, evaluate options, and deliver data-informed recommendations. Identify and... ...with IT, analytics, and project management teams. Experience with AI-enabled analytics, automation, or decision-support solutions,...
Remote work, Work from home

Join to apply for the Online Data Analyst Odia role at TELUS Digital AI Data Solutions. Are you a detail-oriented individual with a passion for research... ...millions of people worldwide Completing research and evaluation tasks in a web-based environment such as verifying and...
Part time, Freelance, Local area, Worldwide

...solutions firm is seeking an Online Data Analyst for a fully remote part-time position. Candidates... ...map content through online research and evaluation tasks. This entry-level role offers... ...team making a difference in the world of AI and data solutions. TELUS...
Remote job, Part time, Flexible hours

$40 per hour
A leading AI development team is seeking experienced quantitative professionals for a flexible remote role involving evaluation of AI-generated work. Ideal candidates have over 2 years of experience in quantitative analysis, strong coding skills, and a background in fields...
Remote job, Hourly pay, Flexible hours

$80.9k - $103.95k
...operational excellence. This role is expected to leverage modern AI-enabled tools (e.g., Microsoft Copilot, Gemini, and similar... ...development. Applies advanced analytical and business judgment to evaluate complex and ambiguous problems, generating actionable insights that...
Contract work, Temporary work, Work experience placement, Local area, Flexible hours

$40 per hour
A leading AI development company is seeking experienced quantitative professionals to evaluate and validate AI-generated work. This fully remote position allows for a flexible schedule, with projects paying $40+ per hour. Candidates should have 2+ years of experience in...
Remote job, Hourly pay, Flexible hours

$40 per hour
A leading AI development firm is seeking experienced quantitative professionals to evaluate AI-generated work and ensure its accuracy. This role involves designing problems to help train AI systems and requires a strong background in statistical methods and modeling. Enjoy...
Remote job, Hourly pay

...strategies, policies, standards and control frameworks. The role evaluates established and emerging data practices and technologies to... ...Provisioning Point / Authorized Data Source program within the AI & Data Organization. Support the definition of a data domain based...
Permanent employment, Full time, Part time, H1b, Work at office, Work visa, Shift work, Day shift

$40 per hour
A cutting-edge AI company is looking for experienced quantitative professionals to evaluate AI-generated quantitative work. You will analyze statistical models, solve quantitative problems, and help validate AI outputs. Candidates should have 2+ years of experience in...
Remote job, Hourly pay, Flexible hours

$60 per hour
...the DataAnnotation team and contribute to developing cutting-edge AI systems, while enjoying the flexibility of remote work and... ...you'll work closely with state-of-the-art AI models on tasks like evaluating AI-generated quantitative analysis, solving technical problems,...
Hourly pay, Full time, Remote work, Flexible hours

$60 per hour
A leading AI development firm is seeking experienced quantitative professionals for remote work. Responsibilities include evaluating AI-generated analysis, designing quantitative problems, and providing feedback to enhance AI models. Candidates should have 2+ years of...
Remote job, Hourly pay

...Description & Requirements Maximus is currently hiring for Quality Control Analysts to join our Veterans Evaluation Services (VES) team. This is a remote opportunity. The Quality Control Analyst is responsible for reviewing Medical Disability Examination ("MDE") reports...
Full time, Contract work, Currently hiring, Work at office, Remote work, Work from home, Home office, Monday to Friday

$40 per hour
A cutting-edge AI development firm is seeking experienced quantitative professionals to evaluate AI-generated quantitative analysis, solve technical problems, and provide valuable feedback. Enjoy a fully remote role with a flexible schedule, where you can work from anywhere...
Remote job, Hourly pay, Flexible hours

$60 per hour
A pioneering AI development organization is seeking quantitative professionals to evaluate AI-generated analyses and conduct statistical work. You will work remotely, selecting projects at your convenience, with competitive pay up to $60/hour. Ideal candidates have at least...
Remote job

$50 - $60 per hour
A data solutions company is seeking a Risk Analyst to evaluate AI models and enhance their performance. The role is remote, offering flexible projects and competitive pay ($50-$60 per hour). Ideal candidates are fluent in English with expertise in corporate finance and...
Remote job, Hourly pay, Flexible hours

...focused technology company is seeking a Value-Based Performance Reporting Analyst to train AI models in North Carolina. This remote position requires expertise in healthcare, enabling the evaluation and enhancement of AI chatbot performance. Candidates must be fluent in...
Remote job, Hourly pay, Flexible hours

$80.6k - $145k
Overview Sr RW Programmer/Sr Data Scientist/Analyst - Real World Data (US and UK Only) Syneos... ...analysis datasets, tables, and figures, evaluating programming processes, and suggesting... ...healthcare data into a common format). Use of AI/ML for LLM and workflow is ideal. Nice to...
Flexible hours

A leading AI firm is seeking an Asset Management Analyst to join its remote team. This role offers flexibility to work part-time or full-time and allows... ...in shaping AI understanding. Responsibilities include evaluating AI outputs against complex financial scenarios and providing...
Remote job, Full time, Part time

...institutions. We specialize in leveraging advanced technologies such as AI, cloud, and data-led innovation to help our clients accelerate... ...and consulting initiatives focusing on domain-specific product evaluation, benchmarking studies, and domain-specific system configurations...
Temporary work, Relocation


