Staff Data Scientist, LLM Data Quality & Evaluation

Cohere

Cohere is looking for a Member of Technical Staff in Data Analysis and Evaluation to ensure the quality and performance of large language models. The role involves designing data collection tasks, collaborating with teams, and applying statistical methods for data evaluation. The ideal candidate will have strong software engineering and statistical skills, along with experience in machine learning frameworks. Applicants from diverse backgrounds are encouraged to apply!

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Staff Data Scientist, LLM Data Quality & Evaluation in San Francisco, CA vacancy
  • $210k - $385k

    A leading financial technology firm seeks a Data Scientist to architect evaluation pipelines and improve answer quality through innovative evaluations. Candidates should possess a PhD or MS in a technical field with at least 4 years of relevant experience, strong skills... 
    Quality

    Pantera Capital

    San Francisco, CA
    18 hours ago
  •  ...candidate for a role focused on improving answer quality across Perplexity's products. You will architect evaluation pipelines, design methods to measure tool impact...  ...technical field and at least 4 years of experience in data science or machine learning, alongside strong... 
    Quality

    Gravity Engineering Services Pvt Ltd.

    San Francisco, CA
    2 days ago
  • $200k - $235k

     ...LiveRamp is the data collaboration platform of choice...  ...global network of top-quality partners. Hundreds...  ...requirements. Staff Data Scientist LiveRamp is the data...  ...the performance of our LLM-based agentic system....  ...and implement rigorous evaluation methodologies to measure... 
    Quality
    Work at office
    Work from home
    Flexible hours
    Night shift

    LiveRamp

    San Francisco, CA
    4 days ago
  • $164.7k - $339.08k

    Position We are looking for a Staff Data Scientist for our Engagement Ecosystem...  ...outcomes. Designing and evaluating interventions that sustainably...  ...engagement, ads, content quality, monetization), turning highly...  ...scikit‑learn, or similar, plus LLM‑based assistants or copilots... 
    Quality
    Temporary work
    Work at office
    Local area
    Relocation
    Relocation package

    Pinterest

    San Francisco, CA
    1 day ago
  •  ...millions of users daily with reliable, high-quality answers grounded in an LLM-first search engine and our specialized data sources. We aim to use the latest models as...  ...Responsibilities Architect and maintain automated evaluation pipelines to assess answer quality across... 
    Quality

    Gravity Engineering Services Pvt Ltd.

    San Francisco, CA
    2 days ago
  •  ...Location Type Hybrid Department Data Science Perplexity serves...  ...daily with reliable, high-quality answers grounded in an LLM‑first search engine and...  ...for our users. As a Data Scientist/Engineer on this team, you...  ...with ground‑truth evaluations Analyze experimental results... 
    Quality
    Full time

    Pantera Capital

    San Francisco, CA
    18 hours ago
  • Obsidian is seeking a talented Data Scientist to join our leading AI lab’s GenAI team. The role involves guiding research teams on data science methodologies and improving AI training data quality. Candidates should have a minimum of 3 years of experience in data science... 
    Quality

    Obsidian

    San Francisco, CA
    3 days ago
  • $238k - $302k

     ...Staff Data Scientist, Driving Quality Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver....  ...simulation across 15+ U.S. states. Rigorous behavioral evaluation of the Waymo Driver is a critical part of scaling our... 
    Quality
    Full time
    Remote work

    Waymo

    San Francisco, CA
    2 days ago
  • $300k

     ...Staff Data Scientist Grindr is an AI-native platform powering how millions of gay people connect...  ...deployed to production and improve quality of life for our millions of users. You...  ...broad, open-ended questions Design and evaluate experiments to measure the impact of... 
    Quality
    Work at office
    Immediate start
    Worldwide
    Flexible hours

    Grindr

    San Francisco, CA
    2 days ago
  • $175k - $235k

     ...road maps and strategies with data-driven analytics and insights. As a senior/staff data scientist focused on product analytics for...  ...decision-making * Design and evaluate product experiments and...  ...distributed data, validating quality and delivering insights without... 
    Quality
    Work at office
    3 days per week

    Gallup

    San Francisco, CA
    18 hours ago
  •  ...at About the Role We’re looking for a Data & LLM Systems Engineer to help us design, build...  ...for monitoring, debugging, and evaluating LLM behavior in production Analysis & Insight...  ...interactions to identify failure modes, drift, and quality issues and help ensure overall... 
    Quality
    Full time
    Home office
    Flexible hours

    Benchstrengthvc

    San Francisco, CA
    4 days ago
  • $160k - $200k

    The Opportunity As a Staff Data Scientist on the Clinical Performance team, you will be the lead...  ...providers. While your primary focus is evaluative, you will be a key player in the broader...  ...forecasting engines that predict our quality performance across various value-based... 
    Quality

    Pearl Health

    San Francisco, CA
    2 days ago
  • $179.5k - $269.5k

     ...billion since 2010. We're looking for a Staff Data Scientist, Finance to own the data science...  ...Design and deploy AI/ML solutions—including LLM-based workflows—to automate financial close...  ...reporting cycles, and improve data quality at scale. Own the financial data foundation... 
    Quality
    Full time
    Work at office
    Flexible hours

    GoFundMe

    San Francisco, CA
    2 days ago
  •  ...research for their next generation of LLM products. Join us if you: Wish to work...  .... Responsibilities Own LLM evaluation processes and methods with a focus on...  ...safety vulnerabilities. Generate high quality synthetic data, curate labels, and conduct rigorous benchmarking... 
    Quality
    Local area
    Shift work

    Capitolis

    San Francisco, CA
    1 day ago
  • $180k - $270k

     ...committed to the highest standards of data security and privacy...  ...distributed systems, data pipelines, or evaluation harnesses that can run at...  ...good" looks like for a Speech LLM, translating capabilities (...  ...transcription accuracy, audio quality, and reasoning of audio models... 
    Quality
    Full time
    Work at office
    Worldwide

    Plaud

    San Francisco, CA
    18 hours ago
  • Dynamo AI is seeking a candidate to lead LLM evaluation and benchmarking in San Francisco, California. You will generate high-quality data and develop innovative methods for assessing the safety and helpfulness of LLMs. The role requires domain knowledge in evaluation techniques... 
    Quality

    Capitolis

    San Francisco, CA
    1 day ago
  • Reflection in San Francisco is seeking a Data Quality Engineer to ensure high-quality data for their LLM models. The role involves partnerships with research teams and designing automated QA methods. Ideal candidates should have strong engineering fundamentals and experience... 
    Quality

    Reflection

    San Francisco, CA
    1 day ago
  • B Capital is seeking a data engineer to ensure high data quality for training AI models. You will own the upstream data quality for LLM post-training and design automated QA methods in a collaborative environment. Ideal candidates will have strong engineering skills, a... 
    Quality

    B Capital

    San Francisco, CA
    2 days ago
  • Airbnb, Inc. is hiring a Senior Staff Machine Learning Engineer, focusing on driving evaluation strategies and data infrastructure for CSxAI initiatives. This role requires...  ...significantly impact ... operations to ensure quality and efficiency in AI applications.... 
    Quality
    Remote job

    Airbnb, Inc.

    San Francisco, CA
    1 day ago
  • $230k - $284k

     ...Staff Product Data Scientist, Expansion Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver...  ...on high-impact projects across the company — from driving quality and operational efficiency to market analysis and rider... 
    Quality
    Full time
    Remote work

    Waymo

    San Francisco, CA
    2 days ago
  • $187k - $230k

     ...Role: Gusto is looking for a highly skilled and motivated Data Scientist to join our Growth Data Science team. In this role, you'll leverage...  ...scale across multiple teams and influence org-wide decision quality in an AI native capacity. Thought Partnership: Serve as a... 
    Quality
    Full time
    Work at office
    Local area
    2 days per week
    3 days per week

    Gusto

    San Francisco, CA
    4 days ago
  •  ...Staff Data Scientist The Data team at Imprint builds the data foundation that powers smarter, faster decision-making. The team develops infrastructure...  ...both daily operations and long-term strategy, enabling high-quality insights into customer behavior, product performance, and... 
    Quality
    Relocation
    Flexible hours
    2 days per week
    3 days per week

    Imprint.com

    San Francisco, CA
    10 days ago
  • $155k - $189k

     ...Staff Data Scientist - Time Products San Francisco, CA - Hybrid; New York, NY - Hybrid; At Gusto, we're on a mission to grow the small...  ...highlight trade-offs and limitations based on sample size and data quality. Execution: Deliver multiple high-impact projects,... 
    Quality
    Full time
    Work at office
    Local area
    2 days per week
    3 days per week

    Gusto

    San Francisco, CA
    4 days ago
  •  ...Perplexity AI Data Scientist Perplexity is AI for people who expect more. This role brings that same standard to how our data team works...  ..., automated dbt model generation and validation, data quality agents that detect, diagnose, and fix issues autonomously... 
    Quality

    Perplexity AI

    San Francisco, CA
    18 hours ago
  • $179.5k - $269.5k

     ...Staff Data Scientist (Pricing) San Francisco, CA Want to help us help others? We're hiring!...  ...modern AI tools and coding agents (e.g., LLM-based assistants, autonomous or semi-autonomous...  ...-driven systems, including monitoring, evaluation, and iteration in live environments.... 
    Full time
    Work at office
    Flexible hours
    Shift work

    GoFundMe

    San Francisco, CA
    4 days ago
  • Ironclad, located in San Francisco, is seeking an AI Evaluation Engineer to join their team. This role involves...  ...closely with AI Engineers to improve model quality. Applicants should have 8+ years of experience in ML or data science, particularly in NLP applications.... 
    Quality
    Contract work

    Ironclad

    San Francisco, CA
    1 day ago
  •  ...offices. About the Role As a Senior Software Engineer (AI Data & Evaluation) at Mercor, you will be at the core of building the data...  ...of frontier AI models. Our team's mission is to develop high-quality data types that push frontier models forward and drive the AI... 
    Quality
    Work at office
    Relocation package

    Mercor Alabaster

    San Francisco, CA
    1 day ago
  •  ...Staff AI Scientist Hybrid - San Francisco, California Our...  ..., activity, and sleep quality by using their Oura Ring...  ...and learn from their data. We are building a...  ...ranking, generation, and evaluation — and you will be the...  ...and preserved by the LLM serving layer. Develop... 
    Quality
    Work at office
    Local area
    Flexible hours
    2 days per week
    3 days per week

    Oura

    San Francisco, CA
    18 hours ago
  • $204k - $259k

     ...hierarchical learning, and robust evaluation. This role follows a hybrid...  ...you will report to a Senior Staff Software Engineer. You will...  ...(RL), for evaluating the quality, safety, and realism of embodied...  ...and extend large-scale data and evaluation pipelines.... 
    Quality
    Full time
    Temporary work
    Remote work

    Waymo

    San Francisco, CA
    1 day ago
  • $302.4k - $378k

     ...Scale has been the leading AI data foundry, helping fuel the most...  ...building upon our prior model evaluation work with enterprise customers...  ...requires not only expertise in LLM agents and planning algorithms...  ...Our products provide the high-quality data and full-stack... 
    Quality
    Full time

    Scale AI

    San Francisco, CA
    7 hours ago