AI Model Evaluation Program Lead

$300k - $320k

Anthropic

About the role:

We are seeking a Technical Program Manager to lead our AI model evaluation initiatives across multiple workstreams. This role will be crucial in assessing the performance, capabilities, limitations, and potential risks of our AI models. Working closely with our Research, Trust & Safety, Frontier Redteaming, and Policy teams, you will drive high-priority evaluation projects to build new processes, align metrics with policy, and track measurable progress. You will help build and adapt the model evaluation program to ensure model deployments are rigorous and aligned with our commitment to responsible AI development. The ideal candidate will have a strong technical background and experience managing cross-functional programs in AI development, ML engineering, or related fields.

You’ll be joining a team of Technical Program Managers who own and drive cross-functional programs that align to the company’s top priorities. In this role, you’ll have the opportunity to make a foundational impact as you contribute to the scaling of a centralized TPM function for the company. Extremely strong soft skills are paramount, as our team is front and center in driving many company-wide changes and top-priority initiatives that require generating buy-in, balancing various opinions, and competing for attention in our rapidly scaling environment.

This role is a great fit for someone who has both seen excellence at scale and operated in rapidly scaling, high-ambiguity teams and scope. We are seeking candidates who have deep TPM expertise but are comfortable acting as adaptable generalists who add value fast. We excel at maintaining a broad view of our work but diving deep into the details when necessary. We understand business goals, translate and organize them into technical programs and projects, and drive execution. We are adept at engaging with both non-technical and technical stakeholders at all levels of the company, including executive leadership.

In this role, you will have the opportunity to shape the development of advanced AI systems and contribute to Anthropic's mission of ensuring that AI benefits all of humanity. If you are passionate about responsible AI development, have a strong technical background, and thrive in a fast-paced, collaborative environment, we'd love to hear from you.

Responsibilities:
  • Partner with teams like Frontier Risk Evaluations, Security, and Trust & Safety to develop and implement comprehensive evaluation protocols for our latest frontier AI models
  • Build a single source of truth for tracking all types of model evaluations as required by our Responsible Scaling Policy, AI safety institutes, the White House, and others
  • Develop and maintain procedures for conducting evaluations, including designing test suites, coordinating red team exercises, and analyzing results
  • Create and manage dashboards and reporting systems to track model performance, safety metrics, and evaluation outcomes across different AI systems and versions
  • Lead cross-functional workshops to identify potential risks and edge cases for evaluation, ensuring thorough coverage of AI capabilities and limitations
  • Coordinate with external partners and industry standards bodies to align our evaluation practices with emerging best practices in responsible AI development
  • Provide detailed status reports, identifying technical risks, dependencies, and areas requiring additional support
  • Facilitate communication and coordination between technical workstreams and stakeholders
  • Continuously identify opportunities for technical process improvements and implement changes as needed
  • Stay up-to-date with the latest developments in AI safety, ML engineering, and related fields to ensure the program remains at the forefront of responsible AI development

You might be a good fit if you:
  • Have several years of experience in technical program management, with a track record of successfully delivering complex technical programs, preferably in AI development, ML engineering, or related fields
  • Have experience executing technical programs that require systems and engineering-level knowledge
  • Have exceptionally strong interpersonal and communication skills that enable you to influence without authority, build cross-organizational support, cooperation and action around initiatives and process adoption
  • Have experience prompt engineering on language models
  • Have experience designing and/or running evaluations on Large Language Models
  • Have knowledge of emerging AI governance frameworks and best practices
  • Have a high threshold for navigating ambiguity and are able to balance setting strategic priorities with rapid, high-quality execution
  • Thrive in unstructured environments, and have a knack for bringing order to chaos

The expected salary range for this position is:

Annual Salary: $300,000 - $320,000 USD

Logistics

Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.

US visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate; operations roles are especially difficult to support. But if we make you an offer, we will make every effort to get you into the United States, and we retain an immigration lawyer to help with this.

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

Compensation and Benefits

Anthropic’s compensation package consists of three elements: salary, equity, and benefits. We are committed to pay fairness and aim for these three elements collectively to be highly competitive with market rates.

Equity - For eligible roles, equity will be a major component of the total compensation. We aim to offer higher-than-average equity compensation for a company of our size, and communicate equity amounts at the time of offer issuance.

US Benefits - The following benefits are for our US-based employees:
  • Optional equity donation matching
  • Comprehensive health, dental, and vision insurance for you and all your dependents
  • 401(k) plan with 4% matching
  • 22 weeks of paid parental leave
  • Unlimited PTO – most staff take between 4-6 weeks each year, sometimes more!
  • Stipends for education, home office improvements, commuting, and wellness
  • Fertility benefits via Carrot
  • Daily lunches and snacks in our office
  • Relocation support for those moving to the Bay Area

UK Benefits - The following benefits are for our UK-based employees:
  • Optional equity donation matching
  • Private health, dental, and vision insurance for you and your dependents
  • Pension contribution (matching 4% of your salary)
  • 21 weeks of paid parental leave
  • Unlimited PTO – most staff take between 4-6 weeks each year, sometimes more!
  • Health cash plan
  • Life insurance and income protection
  • Daily lunches and snacks in our office

Vacancy posted 3 days ago