Product Manager, Public Sector GenAI Test & Evaluation (T&E)

$205.6k - $257k

Scale AI

Locations: San Francisco, CA; St. Louis, MO; New York, NY; Washington, DC

At Scale, our mission is to develop reliable AI systems for the world’s most important decisions. The Public Sector team is at the forefront of this mission, partnering with government agencies to deploy mission-critical agentic solutions.

The Public Sector GenAI T&E Product Manager will be a high-horsepower technical leader, defining the vision and owning the roadmap for our evaluation capabilities. This role requires thriving in unscripted, high-stakes environments, as you will be the primary owner for the T&E tech stack: the robust infrastructure required to continuously measure, improve, and prove the superiority and sustained performance of our agentic applications.

Traversing multiple engineering organizations across Scale, you will identify bottlenecks, distill technical friction into actionable plans, and drive execution. You will work across Scale’s commercial and public sector teams to define requirements, ensuring our evaluation services are robust enough for the most demanding government use cases. Key objectives include refining the tech stack that allows ML teams to hillclimb, and surfacing critical performance information to stakeholders.

Minimum Qualifications (Quantifiable)
  • Engineering Depth: 3+ years of experience in software engineering, systems architecture, or highly technical program management. You must be able to read code, understand system architecture, and participate in technical design reviews alongside engineering teams.
  • Evaluation Systems Expertise: Proven experience designing, owning the roadmap for, or operating the infrastructure required to continuously measure, improve, and show the performance of AI applications.
  • Problem Distillation: Demonstrated experience taking a vaguely defined problem (e.g., "our evaluation cycles are too slow") and delivering a technical roadmap, resource requirements, and measurable success metrics within a narrow time window.
  • Ambiguity Management: Proven track record of taking a project from "stalled/undefined" to "shipped" in a high-pressure environment. You can point to at least two instances where you inherited a failing project and saw it through to production.
  • Cross-Functional Leadership: Led multiple projects that required direct alignment between at least three distinct engineering organizations (e.g., Infrastructure, ML Research, and Product).
  • Operational Execution: Experience using technical project management frameworks (e.g., Linear) to provide consistent weekly reporting on delivery velocity and blockers to executive stakeholders.

Preferred Qualifications (Nice to Haves)
  • Security Clearance: Active Secret, Top Secret, or TS/SCI clearance.
  • GenAI Implementation: Practical experience developing or evaluating features built specifically on LLMs, RAG, or autonomous agent workflows.
  • Technical Rigor: Advanced degree in Computer Science, Engineering, or a related field.
  • Public Sector Expertise: 2+ years of experience working with DoD, IC, or Civil agencies on mission-critical software deployments.

Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position, determined by work location and additional factors, including job-related skills, experience, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity-based compensation, subject to Board of Directors approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for an equity grant.

You’ll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend.

Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary ranges for this full-time position are:
  • San Francisco, New York, Seattle: $205,600 - $257,000 USD
  • Hawaii, Washington DC, Texas, Colorado: $184,800 - $231,000 USD
  • St. Louis: $154,400 - $193,000 USD

PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.

About Us: At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Cisco, DLA Piper, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.

We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status.

We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us at View email address on click.appcast.io. Please see the United States Department of Labor's Know Your Rights poster for additional information.

We comply with the United States Department of Labor's Pay Transparency provision.

We collect, retain and use personal data for our professional business purposes, including notifying you of job opportunities that may be of interest and sharing with our affiliates. We limit the personal data we collect to that which we believe is appropriate and necessary to manage applicants’ needs, provide our services, and comply with applicable laws. Any information we collect in connection with your application will be treated in accordance with our internal policies and programs designed to protect personal data. Please see our privacy policy for additional information.

Vacancy posted 2 days ago
