Staff Machine Learning Research Scientist, LLM Evals
$264.8k - $331k
Scale AI
As the leading data and evaluation partner for frontier AI companies, Scale is dedicated to advancing the evaluation and benchmarking of large language models (LLMs). We are building industry-leading LLM evals, setting new standards for model performance assessment. Our mission is to develop rigorous, scalable, and fair evaluation methodologies to drive the next generation of AI capabilities.
Our Research teams work with the industry's leading AI labs to provide high-quality data and accelerate progress in GenAI research. As a Staff Machine Learning Research Scientist on the LLM Evals team, you will lead the development of novel evaluation methodologies, metrics, and benchmarks to measure the capabilities and limitations of frontier LLMs. You will help define what "good" looks like in generative AI, driving research that informs both our internal roadmap and the broader research community. This role is critical for designing and executing a roadmap that defines best practices in data-driven AI development and will accelerate the next generation of generative AI models in partnership with top foundational model labs.
You will:
- Drive research on the effectiveness and limitations of existing LLM evaluation techniques.
- Design and develop novel evaluation benchmarks for large language models, covering areas such as instruction following, factuality, robustness, and fairness.
- Communicate, collaborate, and build relationships with clients and peer teams to facilitate cross-functional projects.
- Collaborate with internal teams and external partners to refine metrics and create standardized evaluation protocols.
- Implement scalable and reproducible evaluation pipelines using modern ML frameworks.
- Publish research findings in top-tier AI conferences and contribute to open-source benchmarking initiatives.
- Mentor and guide research scientists and engineers, providing technical leadership across cross-functional projects.
- Stay deeply engaged with the ML research community, tracking emerging work and contributing to the advancement of LLM evaluation science.
- Thrive in a high-energy, fast-paced startup environment and be ready to dedicate the time and effort needed to drive impactful results.
- 5+ years of hands-on experience with large language models, NLP, and Transformer modeling, in both research and engineering settings
- A track record of landing major research impact in a fast-paced environment
- Experience tech leading a team of research scientists and research engineers
- Excellent written and verbal communication skills
- Published research in areas of machine learning at major conferences (NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR, etc.) and/or journals
- Previous experience in a customer-facing role
Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity-based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for an equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend.
Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, Seattle is: $264,800 - $331,000 USD
PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.
About Us:
At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Ernst & Young, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.
We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status.
We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us. Please see the United States Department of Labor's Know Your Rights poster for additional information.
We comply with the United States Department of Labor's Pay Transparency provision.
PLEASE NOTE: We collect, retain and use personal data for our professional business purposes, including notifying you of job opportunities that may be of interest and sharing with our affiliates. We limit the personal data we collect to that which we believe is appropriate and necessary to manage applicants' needs, provide our services, and comply with applicable laws. Any information we collect in connection with your application will be treated in accordance with our internal policies and programs designed to protect personal data. Please see our privacy policy for additional information.


