Staff Machine Learning Research Scientist, LLM Evals

$264.8k - $331k

Scale AI

As the leading data and evaluation partner for frontier AI companies, Scale is dedicated to advancing the evaluation and benchmarking of large language models (LLMs). We are building industry-leading LLM evals, setting new standards for model performance assessment. Our mission is to develop rigorous, scalable, and fair evaluation methodologies to drive the next generation of AI capabilities.

Our Research teams work with the industry's leading AI labs to provide high-quality data and accelerate progress in GenAI research. As a Staff Machine Learning Research Scientist on the LLM Evals team, you will lead the development of novel evaluation methodologies, metrics, and benchmarks to measure the capabilities and limitations of frontier LLMs. You will help define what "good" looks like in generative AI, driving research that informs both our internal roadmap and the broader research community. This role is critical for designing and executing a roadmap that defines best practices in data-driven AI development and will accelerate the next generation of generative AI models in partnership with top foundational model labs.

You will:

Drive research on the effectiveness and limitations of existing LLM evaluation techniques.
Design and develop novel evaluation benchmarks for large language models, covering areas such as instruction following, factuality, robustness, and fairness.
Communicate, collaborate, and build relationships with clients and peer teams to facilitate cross-functional projects.
Collaborate with internal teams and external partners to refine metrics and create standardized evaluation protocols.
Implement scalable and reproducible evaluation pipelines using modern ML frameworks.
Publish research findings in top-tier AI conferences and contribute to open-source benchmarking initiatives.
Mentor and guide research scientists and engineers, providing technical leadership across cross-functional projects.
Stay deeply engaged with the ML research community, tracking emerging work and contributing to the advancement of LLM evaluation science.
Thrive in a high-energy, fast-paced startup environment, ready to dedicate the time and effort needed to drive impactful results.

Ideally you'd have:

5+ years of hands-on experience with large language models, NLP, and Transformer modeling, in both research and engineering settings
Experience and a track record of delivering major research impact in a fast-paced environment
Experience tech leading a team of research scientists and research engineers
Excellent written and verbal communication skills
Published research in areas of machine learning at major conferences (NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR, etc.) and/or journals
Previous experience in a customer-facing role

Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity-based compensation, subject to Board of Directors approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for an equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend.

Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in San Francisco, New York, and Seattle is $264,800 - $331,000 USD.

PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.

About Us:

At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Ernst & Young, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.

We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status.

We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us at View email address on click.appcast.io. Please see the United States Department of Labor's Know Your Rights poster for additional information.

We comply with the United States Department of Labor's Pay Transparency provision.

PLEASE NOTE: We collect, retain and use personal data for our professional business purposes, including notifying you of job opportunities that may be of interest and sharing with our affiliates. We limit the personal data we collect to that which we believe is appropriate and necessary to manage applicants' needs, provide our services, and comply with applicable laws. Any information we collect in connection with your application will be treated in accordance with our internal policies and programs designed to protect personal data. Please see our privacy policy for additional information.

Vacancy posted 5 days ago
