Research Scientist, Frontier Risk Evaluations
$216k - $270k
Scale AI
Scale Labs, Research Scientist - Frontier Risk Evaluations
As the leading data and evaluation partner for frontier AI companies, Scale plays an integral role in understanding the capabilities of AI models and systems and in safeguarding them. Building on this expertise, Scale Labs has launched a new team focused on policy research, bridging the gap between AI research and global policymakers so that they can make informed, scientific decisions about AI risks and capabilities.
Our research tackles the hardest problems in agent robustness, AI control protocols, and AI risk evaluations to help governments, industry, and the public understand and mitigate AI risk while maximizing AI adoption. This team collaborates broadly across industry, the public sector, and academia and regularly publishes our findings. We are actively seeking talented researchers to join us in shaping this vision.
As a Research Scientist focused on Frontier Risk Evaluations, you will design and create evaluation measures, harnesses and datasets for measuring the risks posed by frontier AI systems. For example, you might do any or all of the following:
- Design and build harnesses to test AI models and systems (including agents) for dangerous capabilities such as security vulnerability exploitation, CBRN uplift, and other high-risk activities;
- Work with government agencies or other labs to collectively scope and design evaluations to measure and mitigate risks posed by advanced AI systems;
- Publish evaluation methodologies and write technical reports for policymakers.
Ideally you'd have:
- Commitment to our mission of promoting safe, secure, and trustworthy AI deployments in the industry as frontier AI capabilities continue to advance.
- Practical experience conducting technical research collaboratively. You should be comfortable building and instrumenting ML pipelines, writing evaluation harnesses, and quickly turning new ideas from the research literature into working prototypes.
- A track record of published research in machine learning, particularly in generative AI.
- At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development.
- Strong written and verbal communication skills to operate in a cross-functional team.
Nice to have:
- Experience in crafting evaluations and benchmarks, or a background in data science roles related to LLM technologies.
- Experience with red-teaming or adversarial testing of AI systems.
- Familiarity with AI safety policy frameworks (e.g., NIST AI RMF, EU AI Act, Korea AI Basic Act).
Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any LeetCode-style questions. If you're excited about advancing AI safety and contributing to our mission, we encourage you to apply, even if your experience doesn't perfectly align with every requirement.
Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity-based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for an equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend.
Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, and Seattle is: $216,000-$270,000 USD
PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.
About Us:
At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Ernst & Young, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.
We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status.
We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us at the email address provided on click.appcast.io. Please see the United States Department of Labor's Know Your Rights poster for additional information.
We comply with the United States Department of Labor's Pay Transparency provision.
PLEASE NOTE: We collect, retain and use personal data for our professional business purposes, including notifying you of job opportunities that may be of interest and sharing with our affiliates. We limit the personal data we collect to that which we believe is appropriate and necessary to manage applicants' needs, provide our services, and comply with applicable laws. Any information we collect in connection with your application will be treated in accordance with our internal policies and programs designed to protect personal data. Please see our privacy policy for additional information.


