AI Agent Evaluation Analyst

$55 per hour

Mindrift

3 days ago

This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

We’re looking for curious and intellectually proactive contributors, the kind of person who double‑checks assumptions and plays devil’s advocate. Are you comfortable with ambiguity and complexity? Does an async, remote, flexible opportunity sound exciting? Would you like to learn how modern AI systems are tested and evaluated?

This is a flexible, project‑based opportunity well‑suited for:
- Analysts, researchers, or consultants with strong critical thinking skills
- Students (senior undergrads/grad students) looking for an intellectually interesting gig
- People open to a part‑time and non‑permanent opportunity

About the project

We’re on the hunt for QAs for autonomous AI agents for a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you’ll have to balance quality assurance, research, and logical problem‑solving. This project opportunity is ideal for people who enjoy looking at systems holistically and thinking through scenarios, implications, and edge cases.

You do not need a coding background, but you must be curious, intellectually rigorous, and capable of evaluating the soundness and consistency of complex setups. If you’ve ever excelled in things like consulting, CHGK, Olympiads, case solving, or systems thinking, you might be a great fit.

What you’ll be doing

- Reviewing evaluation tasks and scenarios for logic, completeness, and realism
- Identifying inconsistencies, missing assumptions, or unclear decision points
- Helping define clear expected behaviors (gold standard) for AI agents
- Annotating cause‑effect relationships, reasoning paths, and plausible alternatives
- Thinking through complex systems and policies as a human would to ensure agents are tested properly
- Working closely with QA, writers, or developers to suggest refinements or edge case coverage

How to get started

Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.

Requirements

- Excellent analytical thinking: Can reason about complex systems, scenarios, and logical implications
- Strong attention to detail: Can spot contradictions, ambiguities, and vague requirements
- Familiarity with structured data formats: Can read, not necessarily write, JSON/YAML
- Ability to assess scenarios holistically: What’s missing, what’s unrealistic, what might break?
- Good communication and clear writing (in English) to document your findings

We also value applicants who have:
- Experience with policy evaluation, logic puzzles, case studies, or structured scenario design
- Background in consulting, academia, olympiads (e.g. logic/math/informatics), or research
- Exposure to LLMs, prompt engineering, or AI‑generated content
- Familiarity with QA or test‑case thinking (edge cases, failure modes, “what could go wrong”)
- Some understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.)

Benefits

- Get paid for your expertise, with rates that can go up to $55/hour depending on your skills, experience, and project needs
- Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments
- Participate in an advanced AI project and gain valuable experience to enhance your portfolio
- Influence how future AI models understand and communicate in your field of expertise

Vacancy posted 10 hours ago
