AI Data & Model Evaluation Lead

Twelve-Labs

Twelve-Labs in San Francisco is seeking an experienced data operations professional for its ML Data Team. The role focuses on video-language data preparation and model evaluation, and requires strong Python and project management skills. Ideal candidates have over 5 years of experience in AI data operations, can manage large datasets, and are committed to ensuring high-quality data. Benefits include full health coverage and a flexible PTO policy.

Vacancy posted 5 days ago

Similar jobs that could be interesting for you
Based on the AI Data & Model Evaluation Lead vacancy in San Francisco, CA

AI Model Evaluation Leader — Data Quality
...Francisco is seeking a dedicated member for our ML Data Team to lead video data preparation and evaluation. This role includes defining dataset needs, automating... ...should have over 5 years of experience in AI data operations, proficiency in Python, and strong communication...
Data
Flexible hours
Twelve-Labs
San Francisco, CA
4 days ago
Remote Kannada Evaluator for AI Model Quality
$15 - $20 per hour
Mercor is seeking a Generalist with proficiency in English and Kannada to conduct fact-checking and generate evaluation data. This role involves assessing model response quality and ensuring alignment with conversational guidelines. The ideal candidate will possess a Bachelor...
Data
Remote job
Hourly pay
Mercor
San Francisco, CA
4 days ago
ML Evaluation Engineer: Benchmark & Model Quality
A leading AI solutions company in San Francisco is seeking an ML Eval Engineer to design evaluation benchmarks and improve model performance. This role involves working with unstructured enterprise data and collaborating closely with the ML and engineering teams. You will...
Data
Reducto
San Francisco, CA
5 days ago
Model Evaluation & Data Quality Lead
...-edge multimodal foundation models that have the ability to comprehend... ...Ventures, and prominent AI visionaries and founders... ...be a vital member of our ML Data Team - which leads the full spectrum of video-language... ...data preparation and model evaluation. This role comes with high...
Data
Work at office
Worldwide
Flexible hours
Twelve Labs, Inc
San Francisco, CA
2 days ago
Senior Machine Learning Engineer - Model Evaluations, Public Sector New York, NY
$208k - $300k
Machine Learning Engineer - Model Evaluations, Public Sector San Francisco,... ...the team shaping the future of AI at Scale. Machine Learning... .... Background in algorithms, data structures, and object‑oriented... ...that power the world’s leading models, and help enterprises...
Data
Full time
Scale AI, Inc.
San Francisco, CA
4 days ago
Machine Learning Engineer, Model Evaluations (Speech LLM) - San Francisco
$180k - $270k
...building the world’s most trusted AI work companion for professionals... ...committed to the highest standards of data security and privacy protection.... ...systems, data pipelines, or evaluation harnesses that can run at scale against live model checkpoints. Can deeply partner...
Data
Full time
Work at office
Worldwide
Plaud
San Francisco, CA
4 days ago
AI Data Quality & Model Evaluation Associate
Welocalize is seeking a Data Quality Associate to evaluate AI model outputs and provide structured feedback. This is a full-time, onsite role located in San Francisco. The ideal candidate possesses a Bachelor's degree and has 1-2 years of professional writing experience...
Data
Full time
Welocalize
San Francisco, CA
5 days ago
Remote AI Training Specialist: Model Tuning & Evaluation
$25 per hour
Prolific is seeking AI Training Experts to assist in training and evaluating cutting-edge AI models. The role involves completing tasks such as analyzing and writing annotations... ...creates a global pool for quality human data, connecting researchers with quality participants...
Data
Remote job
Hourly pay
Work from home
Flexible hours
Prolific
San Francisco, CA
1 day ago
Software Engineer (Model Evaluation & Benchmarking)
Software Engineer (Model Evaluation & Benchmarking) About the Role We are hiring Engineers focused on AI Model Evaluation to build the systems that ensure multimodal AI behaves... ...(C++, Java, Python, or similar). Strong data structures and algorithms fundamentals. Understanding...
Data
SpreeAI
San Francisco, CA
2 days ago
AI Model Evaluator & Data Quality Analyst
Welocalize is seeking a Data Quality Associate based in San Francisco for a full-time position. This role involves evaluating AI outputs and providing detailed feedback, with applicants needing native-level language proficiency and a university degree. Successful candidates...
Data
Full time
Welocalize
San Francisco, CA
5 days ago
AI Model Behavior Engineer—Quality & Evaluation
...innovative Quality Engineer for their AI products. This role blends ops,... ...AI engineering team, you will use data to shape how AI behaves, work with partners in leading labs, and ensure user satisfaction through effective evaluation baselines. Competitive salary and benefits...
Data
Notion
San Francisco, CA
4 days ago
Lead, Multimodal AI Evaluation & Data Ops
TwelveLabs is seeking a key member for its ML Data Team in San Francisco. This role involves designing evaluation frameworks, managing data operations, and collaborating... ...should have over 5 years of experience in AI data operations, proficiency in Python and a strong...
Data
Flexible hours
TwelveLabs
San Francisco, CA
5 days ago
Lead, Multimodal AI Evaluation & Data Ops
Twelve Labs in San Francisco is seeking a vital ML Data Team member to lead video-language data preparation and model evaluation. You will define dataset needs, automate... ...collaborate cross-functionally with engineering and AI model teams. Ideal candidates have over 5 years...
Data
Flexible hours
Twelve-Labs
San Francisco, CA
1 day ago
Research Lead, Model Evaluation & Training Insights
Anthropic is seeking a Research Lead for the Training Insights team to shape the evaluation of model capabilities. This hands-on leadership role involves developing innovative... ...You will play a crucial role in transforming how AI capabilities are assessed, working...
Remote work
Anthropic
San Francisco, CA
4 days ago
Member of Technical Staff (Model Behavior Architect)
$180k - $260k
Perplexity is looking for a Model Behavior Architect to help shape... ...through well-designed research and evaluation projects. These projects may... ...Demonstrated passion for AI and can share specific, related... ...philosophy, psychology, linguistics, data science, or related fields....
Data
Perplexity
San Francisco, CA
5 days ago
Remote Propulsion Engineer for AI Model Evaluation
YO IT Consulting is seeking a Senior Propulsion Engineer to evaluate AI-generated content related to propulsion engineering. This remote... ...processes would be advantageous. Join a team challenging AI language models to improve their technical reasoning.
Remote job
YO IT Consulting
San Francisco, CA
2 days ago
Finance AI Model Evaluator - Contract, 20 hrs/week
$50 - $75 per hour
A leading tech company based in Australia is seeking an AI Model Evaluator on a contract basis. The role involves evaluating AI-generated responses, writing prompts, and providing justifications based on specific criteria. Ideal candidates will hold a Master's degree in...
Hourly pay
Contract work
Mercor
San Francisco, CA
2 days ago
Model Engineer - Member of Technical Staff
Build the AI infrastructure layer of the physical world At Meter... ...team to build and train models that understand these systems,... ...latency really matter. Unmatched data advantage, control over the full... ...all decisions on a network. Evaluate model performance over real‑...
Data
Meter
San Francisco, CA
2 days ago
Research Engineer - Language Model Pre-Training
...Research Engineer - Language Model Pre-Training, you'll shape our... ...collection, processing, and evaluation Architecture and methodology... ...training pipelines - including model/data parallelism, distributed... ...what we do and love discussing AI Benefits and Perks: Comprehensive...
Data
Work at office
Relocation package
Zyphra
San Francisco, CA
5 days ago
Senior Software Engineer, AI Model Lifecycle
$172.43k - $230.95k
...Senior Software Engineer for the AI Model Lifecycle Team Crusoe is... ...energy, manufacturing, data center construction, and cloud... ...management: versioning, lineage, evaluation, and reproducible fine-tuning... ...years of industry experience leading and driving impactful...
Data
Temporary work
Crusoe
San Francisco, CA
5 days ago
Remote Financial Analyst for AI Model Training
...IT Consulting is seeking finance professionals to evaluate AI-generated financial analyses and enhance model reasoning capabilities. This role involves challenging... ..., and are capable of translating complex financial data into clear insights. The position is remote,...
Data
Remote job
YO IT Consulting
San Francisco, CA
1 day ago
Benchmarking Research Engineer: Frontier Model Evaluations
Refresh AI is seeking a Research Engineer in San Francisco to push the boundaries of benchmarking technology. You will build benchmarks that labs use for evaluating coding abilities and computer-use capability. Your role will require expertise in reinforcement learning...
Full time
Refresh AI
San Francisco, CA
5 days ago
Model Performance Software Engineer, Claude Code
$320k
...interpretable, and steerable AI systems. We want AI to be safe... ...tooling, infrastructure, and evaluations. You’ll build systems that help... ...evaluation systems that measure model capabilities across diverse... ...at scale Develop pipelines for data collection, processing, and analysis...
Data
Work experience placement
Work at office
Visa sponsorship
Flexible hours
Anthropic
San Francisco, CA
3 days ago
Technical Program Manager - Adversarial Model Research
$207k - $285k
About the Team The Human Data team at OpenAI is responsible... ...risks in advanced AI systems by designing evaluations, surfacing vulnerabilities,... ...researchers to strengthen model reliability and public trust... ...Program Manager, you will lead initiatives that test the safety...
Data
Work at office
Relocation package
OpenAI
San Francisco, CA
3 days ago
Staff Software Engineer, Model LifeCycle
$300 per month
...create ambitiously with AI — without sacrificing... ...Software Engineer for the Model LifeCycle team will... ...: versioning, lineage, evaluation, and reproducible fine-... ...of consistent success leading a varied portfolio of initiatives... ...alignment with market data. Equal Opportunity...
Data
Temporary work
Crusoe Energy Systems LLC
San Francisco, CA
5 days ago
AI Engineer - Model Performance
Role Overview We’re hiring a Model Performance Engineer to own the... ...infrastructure that makes the rest of the AI team faster. This is not a... ...than 1% quality degradation. Evaluate serving frameworks (vLLM vs... ...frameworks, understanding of data formatting, learning rate...
Data
Fathom
San Francisco, CA
4 days ago
AI Partnerships & Model Launch Lead
.... is seeking a Technical Business Development professional in San Francisco, CA. You will work directly with partners on AI infrastructure and model launches, serving as their primary contact and ensuring successful integration. Your role involves significant collaboration...
Gravity Engineering Services Pvt Ltd.
San Francisco, CA
4 days ago
AI Inference & Model Routing Lead
Anysphere is looking for an experienced leader for the Model Routing & Inference team in San Francisco. This role involves owning the inference platform that is crucial to AI interactions in the product. You will manage the whole inference path and be responsible for optimizing...
Anysphere
San Francisco, CA
5 days ago
AI Model Launch & Partnerships Lead
$220k - $270k
fal is seeking a Technical Business Development professional to manage partner relationships and drive successful AI model launches in San Francisco. The ideal candidate will possess over 4 years of experience in AI infrastructure and strategic partnerships. Responsibilities...
Contract work
fal
San Francisco, CA
2 days ago
Lead, CS AI Content
$92k - $115k
...Lead, CS AI Content Flex is a growth-stage, NYC headquartered FinTech company that is creating... ...tools, helping ensure AI can retrieve data, trigger actions, or route conversations... ...: experience with chatbot authoring, AI evaluation, or support QA. Compensation Flex...
Data
Full time
Local area
Relocation package
Flexible hours
2 days per week
3 days per week
FLEX Inc
San Francisco, CA
7 days ago
