AI Agent Evaluation Analyst
$55 per hour
Mindrift
Posted 3 days ago

This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI. The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

We’re looking for curious and intellectually proactive contributors, the kind of person who double‑checks assumptions and plays devil’s advocate. Are you comfortable with ambiguity and complexity? Does an async, remote, flexible opportunity sound exciting? Would you like to learn how modern AI systems are tested and evaluated?

This is a flexible, project‑based opportunity well‑suited for:
- analysts, researchers, or consultants with strong critical thinking skills
- students (senior undergrads/grad students) looking for an intellectually interesting gig
- people open to a part‑time and non‑permanent opportunity

About the project

We’re on the hunt for QAs for autonomous AI agents for a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you’ll have to balance quality assurance, research, and logical problem‑solving. This project opportunity is ideal for people who enjoy looking at systems holistically and thinking through scenarios, implications, and edge cases. You do not need a coding background, but you must be curious, intellectually rigorous, and capable of evaluating the soundness and consistency of complex setups.

If you’ve ever excelled in things like consulting, CHGK, Olympiads, case solving, or systems thinking, you might be a great fit.

What you’ll be doing

- Reviewing evaluation tasks and scenarios for logic, completeness, and realism
- Identifying inconsistencies, missing assumptions, or unclear decision points
- Helping define clear expected behaviors (gold standard) for AI agents
- Annotating cause‑effect relationships, reasoning paths, and plausible alternatives
- Thinking through complex systems and policies as a human would to ensure agents are tested properly
- Working closely with QA, writers, or developers to suggest refinements or edge case coverage

How to get started

Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.

Requirements

- Excellent analytical thinking: can reason about complex systems, scenarios, and logical implications
- Strong attention to detail: can spot contradictions, ambiguities, and vague requirements
- Familiarity with structured data formats: can read, though not necessarily write, JSON/YAML
- Ability to assess scenarios holistically: what’s missing, what’s unrealistic, what might break?
- Good communication and clear writing (in English) to document your findings

We also value applicants who have:
- Experience with policy evaluation, logic puzzles, case studies, or structured scenario design
- Background in consulting, academia, olympiads (e.g. logic/math/informatics), or research
- Exposure to LLMs, prompt engineering, or AI‑generated content
- Familiarity with QA or test‑case thinking (edge cases, failure modes, “what could go wrong”)
- Some understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.)

Benefits

- Get paid for your expertise, with rates that can go up to $55/hour depending on your skills, experience, and project needs
- Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments
- Participate in an advanced AI project and gain valuable experience to enhance your portfolio
- Influence how future AI models understand and communicate in your field of expertise


