STEM PhD Researcher for AI Model Evaluation

$70 - $100 per hour

SaidGig

Join a leading AI lab's cutting-edge research team to be at the core of the AI revolution, where your expertise fuels the development of the most advanced LLMs.

Overview

Advanced STEM Researchers and PhD-level subject-matter experts (SMEs) are needed to contribute to a project supporting a frontier-model evaluation effort focused on rigorous scientific and technical reasoning. The AI lab is building next-generation models capable of solving complex, research-grade problems across the sciences and requires deep domain expertise to design, solve, and evaluate the challenging tasks that train and benchmark these systems.

This is a W-2 employment position with Cincinnatus LLC, requiring a commitment of 40 hours per week on weekdays. The role is placed at a leading AI lab as part of its extended workforce.

Key Responsibilities
  • Guide research teams to close knowledge gaps in STEM domains by surfacing edge cases, ambiguities, and frontier problems where current models underperform.
  • Design challenging, rigorous domain tasks and write accurate, well-reasoned solutions that demonstrate expert-level scientific and technical reasoning.
  • Evaluate tasks and solutions produced by AI agents and other contributors, providing clear written technical feedback grounded in domain expertise.
  • Develop evaluation frameworks and rubrics for assessing scientific reasoning quality across STEM domains.
  • Collaborate with other subject matter experts to ensure consistency and accuracy in training data.

Core Qualifications
  • PhD (completed, enrolled, or equivalent research track) in Physics, Chemistry, Biology, Mathematics, Statistics, Computer Science, Electrical Engineering, Mechanical Engineering, Civil Engineering, Materials Science, or another STEM discipline.
  • 3+ years of research, academic, or industry experience in the candidate's primary STEM domain.
  • Demonstrated technical expertise in at least one domain: computational modeling, laboratory methods, data analysis, statistical inference, programming, or equivalent scientific methods.
  • Ability to commit to 40 hours per week during weekdays for the duration of the engagement.
  • Prior experience with data annotation, labeling, evaluation, or human feedback collection is a strong plus.
  • Experience with LLMs, AI systems, or agentic workflows; familiarity with agentic frameworks is a plus.
  • Strong written communication skills; ability to explain complex scientific or technical concepts clearly in writing.

About Cincinnatus LLC:

Cincinnatus LLC is an enterprise staffing company that partners with leading technology companies to source and employ highly skilled professionals for contingent and contract-based opportunities. Cincinnatus serves as the employer of record for these engagements, providing W-2 employment, payroll, benefits, and compliance, while placing employees directly within client teams to work on high-impact initiatives.

Equal Employment Opportunity:

Cincinnatus is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or any other legally protected characteristic.

Vacancy posted 3 days ago