STEM PhD Researcher for AI Model Evaluation
$70 - $100 per hour
SaidGig
Join a leading AI lab's cutting-edge research team to be at the core of the AI revolution, where your expertise fuels the development of the most advanced LLMs.
Overview
Advanced STEM Researchers and PhD-level subject-matter experts (SMEs) are needed to contribute to a project supporting a frontier-model evaluation effort focused on rigorous scientific and technical reasoning. The AI lab is building next-generation models capable of solving complex, research-grade problems across the sciences and requires deep domain expertise to design, solve, and evaluate the challenging tasks that train and benchmark these systems.
This is a W-2 employment position with Cincinnatus LLC, requiring a commitment of 40 hours per week during weekdays. This position will be placed at a leading AI Lab as part of their extended workforce.
Key Responsibilities
- Guide research teams to close knowledge gaps in STEM domains by surfacing edge cases, ambiguities, and frontier problems where current models underperform.
- Design challenging, rigorous domain tasks and write accurate, well-reasoned solutions that demonstrate expert-level scientific and technical reasoning.
- Evaluate tasks and solutions produced by AI agents and other contributors, providing clear written technical feedback grounded in domain expertise.
- Develop evaluation frameworks and rubrics for assessing scientific reasoning quality across STEM domains.
- Collaborate with other subject matter experts to ensure consistency and accuracy in training data.
Qualifications
- PhD (completed, enrolled, or equivalent research track) in Physics, Chemistry, Biology, Mathematics, Statistics, Computer Science, Electrical Engineering, Mechanical Engineering, Civil Engineering, Materials Science, or another STEM discipline.
- 3+ years of research, academic, or industry experience in their primary STEM domain.
- Demonstrated technical expertise in at least one domain: computational modeling, laboratory methods, data analysis, statistical inference, programming, or equivalent scientific methods.
- Ability to commit to 40 hours per week during weekdays for the duration of the engagement.
- Prior experience with data annotation, labeling, evaluation, or human feedback collection is a strong plus.
- Experience with LLMs, AI systems, or agentic workflows; familiarity with agentic frameworks is a plus.
- Strong written communication skills; ability to explain complex scientific or technical concepts clearly in writing.
Cincinnatus LLC is an enterprise staffing company that partners with leading technology companies to source and employ highly skilled professionals for contingent and contract-based opportunities. Cincinnatus serves as the employer of record for these engagements, providing W-2 employment, payroll, benefits, and compliance, while placing employees directly within client teams to work on high-impact initiatives.
Equal Employment Opportunity: Cincinnatus is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or any other legally protected characteristic.
