AI Data & Model Evaluation Lead
Twelve-Labs
Twelve-Labs in San Francisco is seeking an experienced data operations professional for its ML Data Team. This role focuses on video-language data preparation and model evaluation, and requires strong skills in Python and project management. Ideal candidates should have over 5 years of experience in AI data operations, the ability to manage large datasets, and a commitment to ensuring high-quality data. The position includes benefits such as full health coverage and a flexible PTO policy.
- ...Francisco is seeking a dedicated member for our ML Data Team to lead video data preparation and evaluation. This role includes defining dataset needs, automating... ...should have over 5 years of experience in AI data operations, proficiency in Python, and strong communication... (Data, Flexible hours)
- ...YO IT Consulting is seeking a Senior Data Architect to contribute to how AI systems reason about complex enterprise data. This remote... ...with cloud platforms. Responsibilities include evaluating AI responses, refining models, and providing structured feedback on data architecture... (Data, Remote work)
$200k - $365k
...building the world's most trusted AI work companion for professionals... ...committed to the highest standards of data security and privacy protection.... ...systems, data pipelines, or evaluation harnesses that can run at scale against live model checkpoints. Can deeply partner... (Data, Full time, Work at office, Worldwide)
$240.45k - $300.3k
...Machine Learning Engineer - Model Evaluations, Public Sector San Francisco... ...at Scale deploys advanced AI systems, including LLMs, agentic... ...Background in algorithms, data structures, and object-oriented... ...technologies that power the world's leading models, and help enterprises... (Data, Full time)
- A leading AI solutions company in San Francisco is seeking an ML Eval Engineer to design evaluation benchmarks and improve model performance. This role involves working with unstructured enterprise data and collaborating closely with the ML and engineering teams. You will... (Data)
- A cutting-edge AI company located in San Francisco is seeking an ML Eval Engineer to enhance model evaluations and ensure quality metrics. This role involves designing benchmarks,... ...complex problems, and a background in AI or data infrastructure. The position is in-person... (Data)
- ...-edge multimodal foundation models that have the ability to comprehend... ...Ventures, and prominent AI visionaries and founders... ...be a vital member of our ML Data Team - which leads the full spectrum of video-language... ...data preparation and model evaluation. This role comes with high... (Data, Work at office, Worldwide, Flexible hours)
$25 per hour
Prolific is seeking AI Training Experts to assist in training and evaluating cutting-edge AI models. The role involves completing tasks such as analyzing and writing annotations... ...creates a global pool for quality human data, connecting researchers with quality participants... (Data, Remote job, Hourly pay, Work from home, Flexible hours)
- Software Engineer (Model Evaluation & Benchmarking) About the Role We are hiring Engineers focused on AI Model Evaluation to build the systems that ensure multimodal AI behaves... ...(C++, Java, Python, or similar). Strong data structures and algorithms fundamentals. Understanding... (Data)
- ...innovative Quality Engineer for their AI products. This role blends ops,... ...AI engineering team, you will use data to shape how AI behaves, work with partners in leading labs, and ensure user satisfaction through effective evaluation baselines. Competitive salary and benefits... (Data)
- TwelveLabs is seeking a key member for its ML Data Team in San Francisco. This role involves designing evaluation frameworks, managing data operations, and collaborating... ...should have over 5 years of experience in AI data operations, proficiency in Python and a strong... (Data, Flexible hours)
- Twelve Labs in San Francisco is seeking a vital ML Data Team member to lead video-language data preparation and model evaluation. You will define dataset needs, automate... ...collaborate cross-functionally with engineering and AI model teams. Ideal candidates have over 5 years... (Data, Flexible hours)
- ...contribute directly to how the next generation of AI systems understand construction work. You'll challenge and evaluate advanced language models on construction engineering topics to... ...not required. Previous experience with AI data training, annotation, or evaluating AI-... (Data, Remote work)
- ...candidate with a PhD in chemistry to design tasks and workflows evaluating scientific reasoning. Ideal candidates will have strong... ...is a plus. This role is crucial for improving data quality and model evaluation in a collaborative environment... (Data)
$180k - $260k
Perplexity is looking for a Model Behavior Architect to help shape... ...through well-designed research and evaluation projects. These projects may... ...Demonstrated passion for AI and can share specific, related... ...philosophy, psychology, linguistics, data science, or related fields.... (Data)
- ...Science Professionals to join their Expert Network. In this role, you'll evaluate AI-generated scientific responses, fact-check technical claims, and ensure ethical alignment in biological data. Ideal candidates will have a BS, MS, or PhD in relevant fields and experience... (Data, Remote job, Work from home, Flexible hours)
$172.43k - $230.95k
...Senior Software Engineer for the AI Model Lifecycle Team Crusoe is... ...energy, manufacturing, data center construction, and cloud... ...management: versioning, lineage, evaluation, and reproducible fine-tuning... ...years of industry experience leading and driving impactful... (Data, Temporary work)
- Build the AI infrastructure layer of the physical world At Meter... ...team to build and train models that understand these systems,... ...latency really matter. Unmatched data advantage, control over the full... ...all decisions on a network. Evaluate model performance over real‑... (Data)
$281k - $356k
...Staff Software Engineer, Model Post Training Waymo... ...engineers to join our team to lead the post-training LLM... ...generation of frontier AI models. You will:... ...researchers across ML, infra, and data teams. Raise the... ...for how Waymo trains, evaluates, and deploys LLM models... (Data, Full time, Remote work)
- ...Research Engineer - Language Model Pre-Training, you'll shape our... ...collection, processing, and evaluation Architecture and methodology... ...training pipelines – including model/data parallelism, distributed... ...we do and love discussing AI Benefits and Perks: Comprehensive... (Data, Work at office, Relocation package)
$320k
...interpretable, and steerable AI systems. We want AI to be safe... ...Research Engineers to build the evaluations that tell us, and the world,... ...and leadership use to monitor model health during training,... ...operating distributed systems, data pipelines, or other infrastructure... (Data, Remote job, Work at office, Visa sponsorship, Flexible hours)
- ...for the world's most dynamic AI companies, like Cursor, Notion... ...frontier of AI to bring cutting-edge models into production. We're growing... ...helping developers discover, evaluate, and select the right models... ...low‑code, API‑first, or model/data platform company. BENEFITS... (Data, Flexible hours)
$40 per hour
A leading AI data firm is looking for Python Developers to join as Domain Expert participants. The role involves training and evaluating AI models through Python tasks. Candidates should have verifiable Python experience, strong attention to detail, and a reliable internet... (Data, Hourly pay, Remote work, Work from home, Flexible hours)
- Refresh AI is seeking a Research Engineer in San Francisco to push the boundaries of benchmarking technology. You will build benchmarks that labs use for evaluating coding abilities and computer-use capability. Your role will require expertise in reinforcement learning... (Full time)
$208k - $300k
A leading AI company is seeking a Machine Learning Engineer in the Public Sector to develop automated evaluation pipelines for AI models. You will work on advanced AI systems and ensure they perform reliably in mission-critical environments. Ideal candidates have a strong...
$148.5k - $266.2k
...Engineering Manager on the Model Delivery team within... ...Autodesk Research, you will lead production ML... ...deployment, monitoring, evaluation, reliability, and operational... ...generative models and other AI capabilities used across... ...) Experience with 3D data (geometry/CAD/BIM) and/or... (Data, For contractors, Remote work)
- ...organization in San Francisco is seeking a Research & Evaluation Senior Lead to lead research efforts and manage impact... ...requires over 8 years of experience in data analysis leadership, with strong skills in survey design and using AI tools for analysis. Responsibilities... (Data, Remote work, Flexible hours)
$300 per month
...create ambitiously with AI, without sacrificing... ...Software Engineer for the Model LifeCycle team will... ...: versioning, lineage, evaluation, and reproducible fine-... ...of consistent success leading a varied portfolio of initiatives... ...alignment with market data. Equal Opportunity... (Data, Temporary work)
- ...Research Engineering Manager to lead the team of all-star AI researchers and engineers... ...for developing the models that drive our products. Our... ...technical contributions. Own the data, training, and eval... ...iteration velocity. Design evaluations and improve the production... (Data)
$125k - $135k
...The AI Education Project (aiEDU) is a non‑profit devoted to... ...About the Role The Research & Evaluation Senior Lead role is responsible for... ...impact and managing our impact data and organizational dashboards... ...agenda aligned with a logic model for our organization’s impact... (Data, Temporary work, Local area, Immediate start, Remote work, Home office, Flexible hours)