AI Data Quality Auditor — Onsite, Model Evaluation

$34 per hour

Welocalize

Overview

Welo Data is looking for sharp, curious, and detail-oriented individuals to join our team as a Data Quality Associate. This is not a traditional annotation role. You’ll be working directly with cutting-edge AI systems: evaluating outputs, identifying gaps, and helping improve how these systems behave in real-world scenarios. The work sits at the intersection of data quality, model evaluation, and human judgment, where your ability to think critically matters just as much as following guidelines. We’re looking for people who are naturally curious about AI, comfortable forming opinions, and confident in contributing to conversations with teammates, leads, and stakeholders.

Project Details

  • Job Title: Data Labeling Associate
  • Hiring in: NYC, Seattle, Bellevue, Redmond, San Francisco, Sunnyvale, Burlingame
  • Hours: Full-time, 40 hours per week
  • Employment Type: W2 Full-Time Employee
  • Work Authorization: Must be authorized to work in the U.S. (no visa sponsorship)
  • Pay Rate: $34/hour
  • Contract Duration: 1-year contract with possibility of extension

Important: This is a 100% onsite position; remote work is not available for this role. To be considered, candidates must be located in or able to commute to one of the following cities: New York City, Seattle, Bellevue, Redmond, San Francisco, Sunnyvale, or Burlingame. Please only apply if you meet this location requirement.

What You'll Do

  • Evaluate AI model outputs and provide structured, high-quality feedback
  • Perform audit-based reviews of data and model behavior, identifying patterns, edge cases, and failure modes
  • Apply guidelines thoughtfully, and flag when they don’t reflect real-world scenarios
  • Contribute to improving evaluation frameworks, not just executing them
  • Identify trends in model performance and communicate insights clearly
  • Participate in team discussions, calibrations, and stakeholder syncs
  • Partner with leads and cross-functional teams to refine quality standards
  • Document findings in a clear, concise, and actionable way

What We're Looking For

  • Native-level language proficiency and a university degree (Bachelor’s or higher)
  • B2 or higher level of English
  • 1–2 years of professional writing experience with strong, structured writing skills
  • Ability to apply complex writing rules and guidelines consistently
  • Strong understanding of safety considerations in GenAI data delivery, with 2+ years of relevant experience
  • Strong critical thinking and attention to detail
  • Ability to make sound judgment calls in ambiguous situations
  • Naturally curious about AI, technology, and how systems behave
  • Comfortable speaking up, asking questions, and contributing ideas
  • Strong written and verbal communication skills
  • Ability to stay consistent while working with evolving guidelines
  • Experience in data quality, QA, annotation, or analysis is helpful, but not required

Benefits

  • Paid Vacation: 6 days
  • Paid Company Holidays: 2 days (Memorial Day and Labor Day)
  • Paid Sick Leave: accrued per applicable state law and company policy
  • Medical, Dental, and Vision Insurance (eligibility applies)
  • Health Savings Account (HSA)
  • 401(k) Retirement Plan
  • Employee Assistance Program
  • Additional voluntary benefits (life, accident, critical illness, etc.)
  • Free Gourmet Food: Free breakfast, lunch, and dinner are provided, featuring a wide variety of cuisines in multiple cafes.
  • Micro-kitchens & Snacks: Offices are stocked with free snacks and beverages, including premium coffee and La Croix.
  • Unique Campus Features: Some locations include rooftop nature parks.
  • Commuter Benefits: Free transport, shuttles, and sometimes bike-to-work perks.

Vacancy posted 1 day ago
