Research Program Manager - Model Evals and Safety

Reflection AI, Inc

Our Mission

Reflection's mission is to build open superintelligence and make it accessible to all.

We're developing open-weight models for individuals, agents, enterprises, and even nation states. Our team of AI researchers and company builders comes from DeepMind, OpenAI, Google Brain, Meta, Character.AI, Anthropic, and beyond.

About the Role

Research Program Managers at Reflection are high-leverage leaders and operators who embed directly with research and infrastructure teams to accelerate the pace of frontier model development. They are not project trackers. They are force multipliers who bring clarity to ambiguity, drive decisions when the path forward is unclear, and ensure that the work happening across multiple teams connects into a coherent whole.

This is a foundational role. Reflection is building model evals and safety from the ground up, and this RPM will be at the center of that effort. You won't be stepping into an established function with existing processes and tooling. You will be the person who figures out what this function needs to look like, stands it up, and makes it real. That means defining the evaluation frameworks, building the operational infrastructure for model safety, establishing the processes that connect evals to the model development lifecycle, and laying the groundwork for how Reflection interfaces with the broader safety ecosystem. This is 0-to-1 work in its purest form.

You bring a first-responder mentality. When things go sideways, you don't wait to be asked. You jump in, assess the situation, cut through noise, align the people who need to be aligned, and drive resolution.

What You'll Do
  • Build the foundational infrastructure for model evals and safety at Reflection. Define the evaluation frameworks, tooling requirements, and operational processes that will underpin how we assess model capabilities, risks, and readiness for release.
  • Stand up model safety operations as a function, including establishing the workflows, review cadences, and decision frameworks that connect safety evaluation to the model development and release lifecycle.
  • Partner with research and engineering leads across pre-training, mid-training, and post-training to embed safety and evaluation checkpoints into the development process in a way that is rigorous without being a bottleneck.
  • Drive the scoping and prioritization of eval science and eval infrastructure investments, working with technical leads to determine what to build in-house, what to adopt, and where to invest research effort.
  • Establish Reflection's engagement with the external safety ecosystem, including third-party assessments, academic partnerships, and industry safety frameworks. Represent the company's safety posture to external stakeholders with credibility and clarity.
  • Create visibility and reporting structures that give leadership a clear, honest picture of model safety status, evaluation coverage, and open risks, so they can make informed decisions at the pace the business requires.
  • Champion a culture of blameless post-mortems and continuous learning, turning every safety-relevant finding into a concrete improvement to our systems and processes.

About You
  • 7+ years of experience in technical program management, research operations, or ML engineering, with demonstrated experience standing up new functions, teams, or programs from scratch.
  • Familiar with the landscape of model evaluation and AI safety, including evaluation methodologies, red-teaming, alignment research, and the evolving regulatory and industry safety ecosystem. You don't need to be a safety researcher, but you need to understand the space well enough to make sound judgments about what matters and what to prioritize.
  • Deep enough technically to engage with researchers and engineers on topics like model behavior, evaluation design, data pipelines, and safety-critical system architecture. You follow the technical thread and you know when something doesn't add up.
  • Proven ability to build structure where none exists. You've taken ambiguous mandates and turned them into functioning programs with clear ownership, measurable outcomes, and durable processes.
  • Strong stakeholder management skills spanning deeply technical ICs, research leadership, and external partners. You build trust through competence and follow-through.
  • Excited to build from zero to one. We are a small, fast-moving team, and this role will help define how model safety and evaluation work at Reflection.
  • Motivated by enabling researchers and engineers to build the world's most capable open-weight AI systems, responsibly.

What We Offer

We believe that to build superintelligence that is truly open, you need to start at the foundation. Joining Reflection means building from the ground up as part of a small, talent-dense team. You will help define our future as a company and the frontier of open foundational models.

We want you to do the most impactful work of your career with the confidence that you and the people you care about most are supported.
  • Top-tier compensation: Salary and equity structured to recognize and retain the best talent globally.
  • Health & wellness: Comprehensive medical, dental, vision, life, and disability insurance.
  • Life & family: Fully paid parental leave for all new parents, including adoptive and surrogate journeys. Financial support for family planning.
  • Benefits & balance: Paid time off when you need it, relocation support, and more perks that optimize your time.
  • Opportunities to connect with teammates: Lunch and dinner are provided daily. We have regular off-sites and team celebrations.
Vacancy posted 2 days ago