AI Agent Evaluation Analyst
$55 per hour
Mindrift

Posted 3 days ago

This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI. The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

We’re looking for curious and intellectually proactive contributors, the kind of person who double‑checks assumptions and plays devil’s advocate. Are you comfortable with ambiguity and complexity? Does an async, remote, flexible opportunity sound exciting? Would you like to learn how modern AI systems are tested and evaluated?

This is a flexible, project‑based opportunity well‑suited for:
- analysts, researchers, or consultants with strong critical thinking skills
- students (senior undergrads/grad students) looking for an intellectually interesting gig
- people open to a part‑time and non‑permanent opportunity

About the project

We’re on the hunt for QAs for autonomous AI agents for a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you’ll have to balance quality assurance, research, and logical problem‑solving. This project opportunity is ideal for people who enjoy looking at systems holistically and thinking through scenarios, implications, and edge cases. You do not need a coding background, but you must be curious, intellectually rigorous, and capable of evaluating the soundness and consistency of complex setups.

If you’ve ever excelled in things like consulting, CHGK, Olympiads, case solving, or systems thinking, you might be a great fit.

What you’ll be doing

- Reviewing evaluation tasks and scenarios for logic, completeness, and realism
- Identifying inconsistencies, missing assumptions, or unclear decision points
- Helping define clear expected behaviors (gold standard) for AI agents
- Annotating cause‑effect relationships, reasoning paths, and plausible alternatives
- Thinking through complex systems and policies as a human would to ensure agents are tested properly
- Working closely with QA, writers, or developers to suggest refinements or edge case coverage

How to get started

Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.

Requirements

- Excellent analytical thinking: can reason about complex systems, scenarios, and logical implications
- Strong attention to detail: can spot contradictions, ambiguities, and vague requirements
- Familiarity with structured data formats: can read, though not necessarily write, JSON/YAML
- Ability to assess scenarios holistically: what’s missing, what’s unrealistic, what might break?
- Good communication and clear writing (in English) to document your findings

We also value applicants who have:
- Experience with policy evaluation, logic puzzles, case studies, or structured scenario design
- Background in consulting, academia, olympiads (e.g. logic/math/informatics), or research
- Exposure to LLMs, prompt engineering, or AI‑generated content
- Familiarity with QA or test‑case thinking (edge cases, failure modes, “what could go wrong”)
- Some understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.)

Benefits

- Get paid for your expertise, with rates that can go up to $55/hour depending on your skills, experience, and project needs
- Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments
- Participate in an advanced AI project and gain valuable experience to enhance your portfolio
- Influence how future AI models understand and communicate in your field of expertise

Mindrift
$55 per hour
A leading AI innovation firm in Dallas is seeking QAs for autonomous AI agents to ensure the quality of complex systems and scenarios. This flexible, remote project... ...to detail. Candidates should be adept at evaluating scenarios and documenting findings. The role offers...SuggestedRemote jobFlexible hours$65k - $70k
Per Scholas is seeking a Data & Evaluation Analyst to manage learner data and create insightful reports to support organizational decision-making. The role combines analytics with collaboration across various teams, ensuring high-quality data tools are available for enhanced...Suggested$65k - $70k
Position: Data & Evaluation Analyst (Salesforce & Learner Data) REPORTS TO: Director of Data & Insights LOCATION: United States TRAVEL: Periodic, approximately 5% Per Scholas preferred hires reside within the following states: AZ, CA, CO, FL, GA, IL, IN, KS, MD, MA,...SuggestedFull timeTemporary workPart timeLocal areaFlexible hours$80 per hour
Get AI‑powered advice on this job and more exclusive features. This opportunity is only for candidates currently residing in the... ...Protocol (MCP) servers and internal tools for running and evaluating agent behavior. You’ll implement base methods for agent action verification...SuggestedFreelanceRemote workFlexible hours- ...Calling US-based Google Wallet Users for an Exclusive AI Evaluation Project . Open to recent graduates, professionals, and anyone interested... ...Role Overview : Turing is seeking detail-oriented AI Analysts based in the United States to support a Google Wallet...SuggestedFull timeContract workRemote work
$86.6k - $106k
...future for all. You matter, and so does the impact you can make with us. We have an excellent opportunity for a Sr. Program Evaluation Analyst! This is a full-time, benefits-eligible, fixed-term opportunity. Current funding will expire on December 31, 2030. The...Full timeFixed term contractWork at officeLocal areaNight shift$77.7k - $146.9k
...like RSM. Role Overview The Technology Risk Advisory - AI Risk Senior Associate will play a key role in helping clients strengthen... ...the design, assessment, and governance of AI and GenAI systems, evaluate cloud and data controls, and contribute to emerging risk...Work experience placementInternshipLocal area$75k - $90k
...Description Data Science Analyst AArete is one-of-a-kind when it... ...requirements. In the new age of AI, this role will contribute to the design, evaluation, and responsible use of advanced... ...concepts or hands-on experience with AI agents, multi-step AI workflows, tool/...Temporary workWork experience placementWork at officeFlexible hours- ...Senior Data Analyst Batch AI Job Category: Product Management Requisition Number: SENIO001999 Posted: April 9, 2026 Full-Time... ...-Analyze batching data and device-level feed cycle logs to evaluate real-world system performance. -Build and maintain Python-based...Full timeInternship
- ...how we operate. From generative AI and cloud-native technologies... ...role: The Senior Data Analyst - Agentic AI & GenAI Delivery... ...real-world outcomes. Monitor agent behavior, performance, and reliability... ...validation frameworks to evaluate AI system performance,...Work experience placementWork at officeVisa sponsorship2 days per week
- ...Government Services company , is seeking a Data Analyst I to support KPS and our government... ...in textual data Leverage approved AI tools appropriately while adhering to federal... ...childhood education policy or program evaluation Our Equal Employment Opportunity Policy...Contract workWork at officeLocal areaRemote workFlexible hours
$130k - $150k
...Opportunity for advancement Senior Product Manager, AI Agents Work Type: Full-Time, Onsite Location: Dallas, Texas... ...partners and vendors in the AI ecosystem. Lead A/B testing, agent evaluation, and post-deployment analysis to drive continuous improvement....Full time- ...in and operate cutting-edge data center infrastructure to power AI and high-performance computing, addressing the growing demand for... ...and address risks early in project planning and execution. Evaluate and manage risks across pre-construction, procurement, and construction...Contract workFor contractorsLocal area
- Growth & Development Analyst (Data Centers) Galaxy is a global leader in digital assets and... ...edge data center infrastructure to power AI and high‑performance computing,... ...What You’ll Do: Deal Origination & Site Evaluation Identify and evaluate data center development...For contractorsWork at officeLocal area
- Senior Data Analyst (Machine Learning/Automation) - Full‑time Contract (Open Ended). Location... ...scientists to identify and prioritize AI/ML use cases. Understand business use cases... ...framework for large scale model testing, evaluation, and optimization. Assist with process...Full timeContract workTemporary workWork experience placementFlexible hours
- ...Senior Business Systems Analyst Company description A division of Publicis Groupe, Publicis... ...members. Able to assemble, analyze and evaluate data and be able to make appropriate and... ..., and MS Project Familiarity with AI and generative AI tools (e.g., Copilot, ChatGPT...Local areaRemote workWorldwideShift work
$88.54k - $127.16k
...Lead Business Systems Analyst The Lead Business Systems Analyst will contribute to, produce... .... Able to assemble, analyze and evaluate data and be able to make appropriate and... ...PowerPoint, and MS Project Familiarity with AI and generative AI tools (e.g., Copilot,...Temporary workFreelanceWork at officeLocal areaRemote workFlexible hoursShift work- ...AI Process Analyst / Business Transformation Analyst Locations: Jacksonville, FL / Addison,... ...intellectually curious, innovative, and capable of evaluating business processes to identify... ...techniques, workflows, and AI agents. Help business teams understand and...Contract work
- ...Vulnerability Management Analyst (AI Training) About the Role We're looking for experienced security professionals to help train and evaluate cutting-edge AI systems on real-world vulnerability management. Your hands-on knowledge of how security teams actually...Hourly payOngoing contractContract workFreelanceRemote workWorldwideFlexible hours
$109.55k - $156.5k
...experiences happen when humans and AI work together. Our agentic... ...time Senior Business Systems Analyst (Netsuite) to join our... ...operational metrics Critically evaluates information gathered from multiple... ...autonomous voice-first AI agents that automate calls, assist in...Full timeLocal areaFlexible hours- ...to deliver practical workforce insights, develop predictive and AI-powered models, and lead the integration of advanced analytics across... ...Data Engineer, or equivalent). Experience implementing or evaluating AI-enabled HR technologies or people analytics platforms....Work at office
- ...IT Business Analyst Hunton Andrews Kurth LLP, an international law firm, is actively recruiting... ...improvement and innovation efforts, including AI initiatives, by capturing use cases and requirements, coordinating evaluation activities with requesting attorneys/teams...Contract workWork at office
$84k - $108k
...Senior Data Analyst Gartner is seeking a Senior Data Analyst within the Chief Data and... ...impact the delivery of data products Evaluate proposed technical solutions to ensure alignment... .... Gartner is the world authority on AI At Gartner, you'll join a company at...Work at officeImmediate startWork from home$85.6k - $149.4k
...Riverwoods, IL. As a Senior Business Systems Analyst, you will engage in advanced system... ...: Conduct comprehensive evaluations of system requirements and enhancements.... ...in interviews without the assistance of AI tools or external prompts. Our interview...Work at office$60 - $80 per hour
...Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors... ...workspace files such as business plans, reports, and presentations. Evaluate AI models on professional document understanding and instruction...Contract workSummer workRemote work- Alignerr is seeking a Vulnerability Management Analyst to train AI systems on real-world vulnerability management. You will leverage your hands... ...and autonomy as you analyze vulnerability reports and evaluate remediation strategies, with the potential for ongoing work....Remote job
$76.2k - $151k
...That's why we continuously invest in innovative ideas, such as AI-enabled insights and technology-powered solutions, to enhance our... ...applicable. We are committed to a merit-based hiring process, evaluating all candidates consistently using objective, job-related...Work at officeLocal areaWorldwideFlexible hours- Business Intelligence Data Analyst, Data Insights (PRO00000367) Salary Range: Salary commensurate... ...data-informed decision-making by evaluating student learning outcomes, academic and administrative... ..., and evaluated across digital and AI‑enabled information environments....Work experience placementWork at office
- ...Description & Requirements Maximus is currently hiring for Quality Control Analysts to join our Veterans Evaluation Services (VES) team. This is a remote opportunity. The Quality Control Analyst is responsible for reviewing Medical Disability Examination (“MDE”) reports...Full timeContract workCurrently hiringWork at officeRemote workWork from homeHome officeMonday to Friday
$141.34k
Citibank, N.A. seeks a Data Analytics Senior Analyst for its Irving, TX location. Duties:... ...solve complex system issues through in-depth evaluation of business processes, systems, and... ..., Clustering, and Gradient Boosting) and AI-driven insights using advanced Python libraries...Full timeRemote work