Machine Learning and NLP Practitioner for AI Model Evaluation
$80 - $110 per hour
SaidGig
Join a cutting-edge GenAI team at a leading AI lab, where your expertise will be pivotal in developing advanced AI models. This role focuses on designing and evaluating machine learning and natural language processing tasks that will help identify and address capability gaps in frontier AI models.
Key Responsibilities
- Task design and development: Create challenging, real-world ML and NLP problems from your area of expertise, targeting specific capability gaps in a frontier AI model.
- Spec and golden-solution generation: Prepare all necessary components for the problems in an agentic development environment using Python.
- Evaluation and analysis: Assess the target model's performance on your tasks.
- Headroom identification: Identify and classify tasks where the target model fails.
- Collaborate with other experts: Work with fellow subject-matter experts to ensure consistent and accurate evaluations.
Qualifications
- Deep, hands-on experience in machine learning and/or natural language processing, gained through applied industry work, research, or a graduate/PhD background.
- Working proficiency in Python, applied in research, industry, or open-source projects.
- Strong command of modern ML/NLP methods, including model training and evaluation, transformers, large language models, and standard tooling.
- Availability to engage for approximately 20 hours per week.
- Preferred: prior experience in AI training, model evaluation, or data annotation.
- Strong written communication skills and the ability to work independently while managing your own time.
This is a part-time W-2 employment position with Cincinnatus LLC, offering the opportunity to work remotely within the United States.
Compensation
Hourly compensation ranges from $80 to $110.
Eligibility
This role is open to candidates located in the United States.



