Machine Learning and NLP Practitioner for AI Model Evaluation

$80 - $110 per hour

SaidGig

Join a cutting-edge GenAI team at a leading AI lab, where your expertise will be pivotal in developing advanced AI models. This role focuses on designing and evaluating machine learning and natural language processing tasks that will help identify and address capability gaps in frontier AI models.

Key Responsibilities

  • Task design and development: Create challenging, real-world ML and NLP problems from your area of expertise, targeting specific capability gaps in a frontier AI model.
  • Spec and golden-solution generation: Prepare all necessary components for the problems in an agentic development environment using Python.
  • Evaluation and analysis: Assess the target model's performance on your tasks.
  • Headroom identification: Identify and classify tasks where the target model fails.
  • Collaborate with other experts: Work with fellow subject-matter experts to ensure consistent and accurate evaluations.

Core Qualifications

  • Deep, hands-on experience in machine learning and/or natural language processing, gained through applied industry work, research, or a graduate/PhD background.
  • Working proficiency in Python, applied in research, industry, or open-source projects.
  • Strong command of modern ML/NLP methods, including model training and evaluation, transformers, large language models, and standard tooling.
  • Availability to engage for approximately 20 hours per week.
  • Prior experience in AI training, model evaluation, or data annotation is preferred.
  • Strong written communication skills and the ability to work independently while managing your own time.

Work Terms

This is a part-time W-2 employment position with Cincinnatus LLC, offering the opportunity to work remotely within the United States.

Compensation

Hourly compensation ranges from $80 to $110.

Eligibility

This role is open to candidates located in the United States.

Vacancy posted 1 day ago
