AI Evaluation Program Manager
TwelveLabs
Who We Are
At Twelve Labs, we are pioneering the development of cutting‑edge multimodal foundation models that can comprehend videos just like humans do. Our models have redefined the standards in video‑language modeling, empowering us with more intuitive and far‑reaching capabilities, and fundamentally transforming the way we interact with and analyze various forms of media.

With $107 million in Seed and Series A funding, our company is backed by top‑tier venture capital firms such as NVIDIA’s NVentures, NEA, Radical Ventures, and Index Ventures, and prominent AI visionaries and founders such as Fei‑Fei Li, Silvio Savarese, Alexandr Wang and more. Headquartered in San Francisco, with an influential APAC presence in Seoul, our global footprint underscores our commitment to driving worldwide innovation.

We are a global company that values the uniqueness of each person’s journey. We are looking for individuals who are motivated by our mission and eager to make an impact as we push the bounds of technology to transform the world. Join us as we revolutionize video understanding and multimodal AI.

About the Role
You will be a vital member of our ML Data Team, which leads the full spectrum of video‑language data preparation and model evaluation. This role comes with high ownership and includes responsibilities such as defining dataset needs and requirements in consultation with our research and product teams, designing and building data pipelines, and driving our post‑training model evaluation strategy. You will also be responsible for automating as much of the repetitive partnership, annotation, and quality evaluation work as possible. A desire to work cross‑functionally and to build relationships is critical for success in this position.

You Will
- Model Evaluation: Design and build robust model evaluation frameworks, automating repetitive processes and balancing efficiency with depth to obtain evaluation metrics and feedback.
- Portfolio Monitoring: Manage resource allocation and timelines, adjusting direction flexibly based on real‑time information across all data streams in your product vertical.
- External Partner Collaboration: Enhance dataset and process quality through seamless collaboration with vendors and outsourcing partners.
- Data Quality & Tooling Advancement: Establish labeling guidelines, monitor data quality, and improve tools and infrastructure to build a sustainable data operations framework.
- Internal Collaboration: Partner with Engineering and AI Model teams to align on top‑priority data needs, design tools such as analytical reports and dashboards, and clearly communicate project progress.

You May Be a Good Fit If You Have
- 5+ years of experience working in an AI‑focused data operations organization.
- A proven track record designing and executing large‑scale data or evaluation projects, including gathering, labeling, and post‑processing data.
- The ability to analyze messy and complex data, identify overarching patterns, and distill your findings into crisp annotation guidelines or model quality reports.
- Proficiency with Python, LLMs, or other popular industry tools for automation.
- Excellent communication and project management skills, and the ability to support several projects simultaneously.
- A foundational understanding of and interest in LLMs/VLMs and multimodal AI.
- Conviction that data is the key ingredient for the performance and assessment of AI models.

You’ll Stand Out If You Have
- Experience in data collection and labeling for multimodal language models.
- Experience in red teaming, localization testing, or other evaluation‑focused fields.
- Experience working with research scientists and engineers.
- Expertise or interest in video‑centric domains, such as sports, advertising, and content creation.

Tech Stack
- Development & Analysis: Python (primarily pandas, Jupyter, etc.)
- Data Management & Visualization: Amazon S3, various data visualization tools (framework‑agnostic)
- Project Management Tools: Linear, Notion

Even if there are a few checkboxes that aren’t ticked through your prior experience, we still encourage you to apply! If you are a 0‑1 achiever, a ferocious learner, and a kind and fun team player who motivates others, you will find a home at TwelveLabs.

Benefits and Perks
- Open and inclusive culture and work environment.
- Work closely with a collaborative, mission‑driven team on cutting‑edge AI technology.
- Full health, dental, and vision benefits.
- Flexible PTO and parental leave policy.
- Office closed the week of Christmas and New Year’s.


