AI Model Evaluation Lead: Metrics, Bias & Fairness
MERIT Beauty
We are seeking an expert to evaluate and improve our AI models through comprehensive testing and analysis. You will be responsible for designing evaluation frameworks, conducting model assessments, and providing actionable insights for model improvement.
Key Responsibilities
- Design and implement evaluation metrics for AI models
- Conduct thorough testing of model performance across different scenarios
- Analyze model outputs for bias, fairness, and accuracy
- Collaborate with ML engineers to implement improvements
- Document findings and recommendations
Ideal Candidate
- PhD or Masters in Computer Science, ML, or related field
- 5+ years experience in AI/ML model evaluation
- Strong background in statistical analysis
- Experience with evaluation frameworks and metrics
- Excellent communication skills
$240.45k - $300.3k
...Learning Engineer - Model Evaluations, Public Sector... ...deploys advanced AI systems-including... ...robustness, and safety metrics, including LLM-... ...measure generalization, bias, explainability,... ...us to ensure a fair and thorough evaluation... ...power the world's leading models, and help...
Full time
- ...AI Model Evaluation Specialist New York, New York, United States About the Job Key Responsibilities: Perform scoring and qualitative... ...prompt tuning or model fine-tuning. Utilize NLP-based metrics and tools (e.g., ROUGE, BLEU, cosine similarity) for...
$163.6k - $225k
...Scientist focusing on AI & Model Risk, you will lead and coordinate AI... ...and safety guardrails, bias testing, A/B testing,... ...testing as needed. Evaluate ongoing monitoring... ...controls, reliability metrics, CSAT, acceptable performance... ...and local laws and "fair chance" ordinances....
Work experience placement, Work at office, Local area, Remote work, Flexible hours
$124k - $177k
...Intelligence & Data (AI&D) organization, you... ...stakeholders to ensure proper modeling processes are... ...practical controls and evaluations (e.g., bias testing, stress... ...features, and evaluation metrics. Parallel validation... ...criteria for drift, bias/fairness, stability,...
Local area, 3 days per week
$136.8k - $292.6k
...assurance and evaluating the company's risk... ...scientists and AI developers who... ...for auditing models, including criteria... ...robustness, fairness, interpretability... ...including bias, fairness violations... ...- Measurement Metrics & Statistical... ...TikTok is the leading destination for...
Temporary work, Local area
- A forward-thinking tech company in the United States is seeking a CIO to join their remote team to train AI models. Responsibilities include evaluating chatbot outputs and enhancing AI quality. Candidates should be fluent in English and possess strong analytical capabilities...
Hourly pay, Contract work, Remote work
- ...A tech company is seeking a Postdoctoral Researcher to evaluate AI chatbots and improve their performance. This role is remote and can be part-time or full-time, allowing you to choose the projects you want to work on. Candidates must have a strong understanding of physics...
Hourly pay, Full time, Part time, Remote work
- ...Role: AI Team Lead Location: NYC, NY Project description... ...convert research and models into scalable,... ...model training, evaluation, versioning, and monitoring... ...observability: metrics, logging, model drift... ...validation techniques, and bias/fairness considerations....
- ...The AI Platform Lead owns the definition, creation, and ongoing management... ...the long‑term scaling model for the AI platform,... ...publish, and track platform metrics KPIs/SLAs/SLOs (availability... ...testing strategies, GenAI/LLM evaluation, bias/fairness, model governance, and...
Local area, Remote work, Flexible hours, Shift work
$150 per hour
...Modern MedEd is hiring Psychiatry experts to design clinical scenarios and evaluate AI-generated model outputs in healthcare. This role requires board certification and offers remote, flexible participation at a rate of $150–$350/hr based on experience. Responsibilities...
Remote work, Flexible hours
$40 per hour
...A leading AI systems firm is seeking experienced quantitative professionals to evaluate AI-generated quantitative work and contribute to cutting-edge AI projects. Position offers a fully remote work environment with flexible scheduling and competitive pay, starting at...
Hourly pay, Remote work, Flexible hours
- ...A technology firm in Pennsylvania is seeking a Physics Expert to train AI models by providing complex physics challenges and evaluating the outputs of these AI chatbots. You must have an expert-level knowledge of physics, particularly in areas like classical mechanics...
Hourly pay, Remote work, Flexible hours
- ...A leading AI development firm is seeking experienced quantitative professionals to evaluate AI-generated analyses and contribute to training robust AI models. This remote role offers flexibility and competitive pay, with opportunities for those from diverse quantitative...
Remote work
- ...A leading AI data services provider is seeking highly qualified mathematics experts for a remote role focused on evaluating and annotating mathematical content. The ideal candidate should hold a Master's degree in a related field, with a PhD preferred. Responsibilities...
Remote work, Flexible hours
$50 - $60 per hour
...A technology company in the United States is seeking a Director of Finance to join its team. This role involves evaluating AI models and providing complex problems for chatbots. Candidates should have a strong financial background and ideally hold a Masters or PhD in...
Hourly pay, Full time, Part time, Remote work
- ...Mercor is seeking Part-time Chemistry Researchers to connect elite talent with leading AI labs. The role involves authoring challenging chemistry problems, evaluating model outputs, and identifying failures. Ideal candidates will have significant publications in top...
Hourly pay, Part time, Remote work
$60 per hour
...and contribute to developing cutting-edge AI systems, while enjoying the flexibility... ...professionals to help advance AI development. AI models are increasingly capable of performing... ...state-of-the-art AI models on tasks like evaluating AI-generated quantitative analysis,...
Hourly pay, Full time, Remote work, Flexible hours
$40 per hour
...A leading data services company in Pennsylvania is seeking a Research And Development Chemist to join their team. The successful candidate will evaluate AI chatbots by measuring their outputs against complex chemistry questions. The role offers flexibility in project selection...
Hourly pay, Contract work, Remote work
$40 per hour
...A technology company is seeking a Scientific Researcher (Physics) to train AI models by solving complex physics problems and evaluating their outputs. Candidates should have a strong understanding of classical mechanics and related physics concepts. The position offers...
Hourly pay, Remote work, Flexible hours
$40 per hour
...A dynamic AI company is seeking an R&D Physicist to join their team. In this role, you will evaluate AI chatbots by providing complex physics questions and assess their outputs for correctness and performance. A strong understanding of classical mechanics, electromagnetic...
Hourly pay, Remote work
- ...A leading insurance tech company is looking for an Insurance Subject Matter Expert with over 7 years of experience. In this role, you will evaluate AI models and ensure they align with insurance regulations and customer needs. Candidates should have hands-on experience...
$150 per hour
...A leading AI platform is seeking AI Trainers – Machine Learning Specialists to evaluate and train AI models using expert knowledge. Candidates can earn up to $150/hr per completed task. Responsibilities include performing tasks related to AI training and providing feedback...
Remote work, Flexible hours
$60 per hour
Prolific Academic Ltd is looking for Biology Experts and Life Science Professionals to join their Expert Network. This role involves evaluating AI-generated science, fact-checking technical claims, and assessing experimental logic using your scientific expertise....
Hourly pay, Work from home
$40 per hour
A tech company specializing in AI training is seeking a Data Analyst (PhD) to enhance AI chatbot performance by providing complex mathematical evaluations. This role requires fluency in English and strong mathematical reasoning skills, with flexibility in work arrangements...
Hourly pay, Full time, Part time, Remote work
$150 per hour
...A leading AI data platform is seeking an AI Trainer – Machine Learning Specialist to assist in training cutting-edge AI models. The role involves completing AI training tasks, providing expert feedback, and evaluating AI performance using specialized skills in machine...
Remote work, Flexible hours
- A leading data services company is seeking an Applied Mathematician to evaluate AI models by providing complex mathematical problems to chatbots and assessing their outputs for quality and performance. This role offers flexibility with fully remote work and allows you to...
Hourly pay, Remote work
- A data-driven company in the United States is seeking a Data Scientist to train AI models and evaluate their outputs. In this role, you will ensure quality and performance of AI chatbots while working remotely with flexible hours. The ideal candidate will have a strong...
Hourly pay, Remote work, Flexible hours
- A leading AI training company is seeking medical experts to evaluate AI chatbots' responses to complex healthcare problems in a REMOTE position. Applicants need to hold a medical degree or be in-progress towards one. Responsibilities include ensuring medical accuracy of...
Hourly pay, Remote work
$130 per hour
...Obsidian is seeking Emergency Medicine experts to design clinical scenarios and evaluate AI models in a remote capacity. This role requires board certification and a current active medical license. Tasks include creating prompts rooted in EM practice, writing high-quality...
Remote work, Flexible hours
$30 per hour
...A prominent AI data company is seeking Advanced Mandarin Speakers for evaluating AI models and completing training tasks. This remote role offers competitive pay of $30/hr for tasks requiring one hour of focused work. Candidates should be fluent in Mandarin with the ability...
Remote work, Work from home, Flexible hours

