Research Scientist, Frontier Risk Evaluations

$216k - $270k

Scale AI

Scale Labs, Research Scientist - Frontier Risk Evaluations

As the leading data and evaluation partner for frontier AI companies, Scale plays an integral role in understanding the capabilities of AI models and systems and in safeguarding them. Building on this expertise, Scale Labs has launched a new team focused on policy research. The team bridges the gap between AI research and global policymakers, helping them make informed, scientific decisions about AI risks and capabilities.

Our research tackles the hardest problems in agent robustness, AI control protocols, and AI risk evaluations to help governments, industry, and the public understand and mitigate AI risk while maximizing AI adoption. The team collaborates broadly across industry, the public sector, and academia and regularly publishes its findings. We are actively seeking talented researchers to join us in shaping this vision.

As a Research Scientist focused on Frontier Risk Evaluations, you will design and create evaluation measures, harnesses and datasets for measuring the risks posed by frontier AI systems. For example, you might do any or all of the following:

Design and build harnesses to test AI models and systems (including agents) for dangerous capabilities such as security vulnerability exploitation, CBRN uplift, and other high-risk activities;
Work with government agencies or other labs to collectively scope and design evaluations to measure and mitigate risks posed by advanced AI systems;
Publish evaluation methodologies and write technical reports for policymakers.

Ideally you'd have:

Commitment to our mission of promoting safe, secure, and trustworthy AI deployments in the industry as frontier AI capabilities continue to advance.
Practical experience conducting technical research collaboratively. You should be comfortable building and instrumenting ML pipelines, writing evaluation harnesses, and quickly turning new ideas from the research literature into working prototypes.
A track record of published research in machine learning, particularly in generative AI.
At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development.
Strong written and verbal communication skills to operate in a cross-functional team.

Nice to have:

Experience in crafting evaluations and benchmarks, or a background in data science roles related to LLM technologies.
Experience with red-teaming or adversarial testing of AI systems.
Familiarity with AI safety policy frameworks (e.g., NIST AI RMF, EU AI Act, Korea AI Basic Act).

Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any LeetCode-style questions. If you're excited about advancing AI safety and contributing to our mission, we encourage you to apply, even if your experience doesn't perfectly align with every requirement.

Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; the actual salary will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity-based compensation, subject to Board of Directors approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process and confirm whether the hired role will be eligible for an equity grant. You'll also receive benefits including, but not limited to, comprehensive health, dental, and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. This role may also be eligible for additional benefits such as a commuter stipend.

Please refer to the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in San Francisco, New York, and Seattle is $216,000 - $270,000 USD.

PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.

About Us:

At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Ernst & Young, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.

We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status.

We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us. Please see the United States Department of Labor's Know Your Rights poster for additional information.

We comply with the United States Department of Labor's Pay Transparency provision.

PLEASE NOTE: We collect, retain and use personal data for our professional business purposes, including notifying you of job opportunities that may be of interest and sharing with our affiliates. We limit the personal data we collect to that which we believe is appropriate and necessary to manage applicants' needs, provide our services, and comply with applicable laws. Any information we collect in connection with your application will be treated in accordance with our internal policies and programs designed to protect personal data. Please see our privacy policy for additional information.
