- ...technical experts to support a frontier-model evaluation project focused on agentic workflows. You will design and validate challenging benchmark tasks in data science, machine learning, finance, and coding to help identify reasoning and problem-solving gaps in advanced... [Suggested; Remote job; Contract work; For contractors]
- $80 - $120 per hour: ...we're building the talent engine that helps leading labs and research orgs move AI forward. Our latest initiative focuses on benchmarking and improving model performance and training speed across real ML workloads. If you're an early-career Machine Learning Engineer... [Suggested; Remote job; Part time; For contractors]
- ...machine learning engineering tasks. The work focuses on translating practical ML research and engineering workflows into structured benchmarks for frontier models. This is a project-based, remote opportunity suited for experts with hands-on ML research experience. Key... [Suggested; Remote job; Contract work; Temporary work; For contractors]
- ...with expected conversational behavior and system guidelines. Apply consistent evaluation standards by following clear taxonomies, benchmarks, and detailed evaluation guidelines. Who You Are: You hold a BS, MS, or PhD in Computer Science or a closely related field... [Suggested; Full time; Contract work; Part time; For contractors; Remote work]
- ...expectations for scientific accuracy, reasoning quality, and clarity, comparing Korean and English responses where needed. Support Benchmarking and Quality Assurance: Collaborate in QA review processes to ensure prompt tasks and rubrics meet scientific standards,... [Suggested; Remote job; Contract work; For contractors]
- ...assess preliminary outputs for accuracy, fluency, and cultural fit in Japanese, comparing results against English where needed. Benchmarking Quality Assurance: Collaborate in QA review processes to ensure prompt tasks and rubrics meet rigor, maintaining consistency... [Suggested; Remote job; Contract work; Temporary work; For contractors; Freelance; Flexible hours]
- ...Schedule and participate in IEP and other meetings for students with disabilities. Prepare and administer all standardized tests, benchmark assessments and evaluation assessments as directed. Work with the teaching staff to improve standardized and proficiency testing... [Suggested; Temporary work; Local area; Afternoon shift]
- ...expectations for scientific accuracy, reasoning quality, and clarity. Compare German and English responses where needed. Support Benchmarking and Quality Assurance: Collaborate in QA review processes to ensure prompt tasks and rubrics meet scientific standards.... [Suggested; Remote job; Contract work; For contractors]
- ...explaining investigative and adversarial decisions. Identify weaknesses in AI threat analysis and suggest improvements. Help refine benchmarks for detection, triage, and attack simulation accuracy. Requirements Experience ~5+ years in cybersecurity with experience... [Suggested; Remote job; Full time; Contract work; Part time; For contractors; Flexible hours]
- $40 - $60 per hour: ...to investment performance tables, portfolio analysis, or financial modelling ~ Experience interpreting market comparison data, benchmarking reports, and sector performance metrics ~ Familiarity with trading analytics, financial ratios, and performance indicators ~... [Suggested; Remote job; Contract work; Part time; For contractors; Flexible hours]
- ...expected conversational behavior and system guidelines. Apply consistent evaluation standards by following clear taxonomies, benchmarks, and detailed evaluation guidelines. Who You Are: You hold a PhD in Chemistry or a closely related field. You have deep expertise... [Suggested; Remote job; Full time; Contract work; Part time; For contractors; Immediate start; Flexible hours]
- ...-quality human data: annotate failures, classify vulnerabilities, and flag systemic risks. Apply structure: follow taxonomies, benchmarks, and playbooks to keep testing consistent. Document reproducibly: produce reports, datasets, and attack cases customers can act... [Suggested; Remote job; Full time; Contract work; Part time]
- ...matter expert for pricing tools and implementation. Market & Competitive Intelligence: Conduct market analysis and competitive benchmarking to ensure pricing competitiveness, translating raw market data into actionable comparisons. Compliance and Pricing Governance... [Suggested; Worldwide]
- ...align with expected conversational behavior and system guidelines. Apply consistent annotations by following clear taxonomies, benchmarks, and detailed evaluation guidelines. Who You Are: You hold a Bachelor's degree. You are a native speaker or have ILR 5/primary... [Suggested; Full time; Contract work; Part time; For contractors; Remote work]
- ...implement tools for monitoring, alerting, and operational visibility across exchange systems. Work with backend engineers to deploy, benchmark, and tune matching engines and market data services. Develop and maintain APIs, microservices, and CI/CD pipelines supporting... [Suggested; Full time; Remote work]
- ...performance to senior leadership. Actively measure and evaluate the performance of Sales Executives, measuring against clearly identified benchmark guidelines; ensure that the performance of all key metrics in Salesforce.com is properly measured. Provide coaching and... [Work at office; Worldwide; Weekend work; Afternoon shift]
- ...leader in material handling solutions, serving a broad range of customers across multiple industries. We consistently set the industry benchmark, from everyday improvements to the breakthroughs at moments that matter most, because we know we can always find a safer, more... [Remote job; For subcontractor]
- ...Security Hub, Azure Security Center, or other 3rd party tools to assess the security posture of cloud environment against industry benchmarks (such as NIST 800-53, CIS, MITRE ATT&CK, CSA CSM, ISO27002, etc.). Professional security certification such as CCSP/CCSK/CISSP... [Remote job; Work at office; Local area; Flexible hours]
- ...Conduct Model Testing and Grading: Run prompts through models and assess preliminary outputs against expectations. Support Benchmarking and Quality Assurance: Collaborate in QA review processes to ensure prompt tasks and rubrics meet rigor, maintaining consistency... [Hourly pay; Contract work; Temporary work; For contractors; Freelance; Remote work; Flexible hours]
- $20 - $30 per hour: ...Identify and document factual inaccuracies, logical inconsistencies, and reasoning gaps. Provide structured feedback and benchmarking using specialized evaluation tools. Work independently and asynchronously as part of a distributed research team. Requirements... [Weekly pay; Contract work; Part time; Remote work; Flexible hours]
- $35 - $45 per hour: ...with human reasoning. Develop prompt-response iterations using variations in length, style, and system prompts. Contribute to benchmarking efforts by identifying failure points in model responses. Apply consistent quality standards across diverse content types (e.g... [Weekly pay; Contract work; Temporary work; For contractors; Remote work; Flexible hours]
- ...Affiliations: Being a member of elite AI and tech associations, especially those with rigorous membership standards, is a plus. Earnings Benchmark: If your compensation stands above your contemporaries, it reflects your unparalleled expertise. Commercial Impact: Your...
- ...consulting workflows (problem structuring, analysis, synthesis) into task specifications and evaluation rubrics. Review, edit, and benchmark AI-generated outputs such as market analyses, strategy decks, operating models, and written recommendations. Create high-... [Contract work; For contractors; Remote work]
- ...leader in material handling solutions, serving a broad range of customers across multiple industries. We consistently set the industry benchmark, from everyday improvements to the breakthroughs at moments that matter most, because we know we can always find a safer, more... [Remote job]
- ...leader in material handling solutions, serving a broad range of customers across multiple industries. We consistently set the industry benchmark, from everyday improvements to the breakthroughs at moments that matter most, because we know we can always find a safer, more... [Remote job; For subcontractor; Work at office]
- ...audience levels. Ensure responses align with conversational guidelines and evaluation standards. Apply structured taxonomies, benchmarks, and detailed review frameworks. Required Qualifications ~ PhD in Chemistry or a closely related field. Deep expertise in... [Weekly pay; Contract work; For contractors; Remote work; Flexible hours]
- ...you. Must leave work area clean and sanitized at end of each shift. Carry out the policies and procedures of Stonewall Resort and Benchmark Hospitality Group while maintaining the highest degree of professionalism and teamwork atmosphere as per standards of service. Follow... [Hourly pay; Work at office; Local area; Immediate start; Worldwide; All shifts; Shift work]
- ...PYRAMID’S progressive “Be The Difference” culture and values are a cornerstone to the company’s nearly 40 years of extraordinary achievement and prosperity. Many properties have been recognized with prestigious national and international awards. BENCHMARK [Work at office; Local area; Immediate start; Worldwide; Weekend work]
- ...PYRAMID’S progressive “Be The Difference” culture and values are a cornerstone to the company’s nearly 40 years of extraordinary achievement and prosperity. Many properties have been recognized with prestigious national and international awards. BENCHMARK [Hourly pay; Work at office; Local area; Immediate start; Worldwide; Flexible hours]
- ...PYRAMID’S progressive “Be The Difference” culture and values are a cornerstone to the company’s nearly 40 years of extraordinary achievement and prosperity. Many properties have been recognized with prestigious national and international awards. BENCHMARK [Work at office; Local area; Immediate start; Worldwide; Shift work]
