Product Manager, Public Sector GenAI Test & Evaluation (T&E)
$205.6k - $257k
Scale AI
At Scale, our mission is to develop reliable AI systems for the world's most important decisions. The Public Sector team is at the forefront of this mission, partnering with government agencies to deploy mission-critical agentic solutions.
Role Overview
The Public Sector GenAI T&E Product Manager will be a high-horsepower technical leader, defining the vision and owning the roadmap for our evaluation capabilities. This role requires thriving in unscripted, high-stakes environments, as you will be the primary owner for the T&E tech stack—the robust infrastructure required to continuously measure, improve, and prove the superiority and sustained performance of our agentic applications.
Traversing multiple engineering organizations across Scale, you will identify bottlenecks, distill technical friction into actionable plans, and drive execution. You will work across Scale's commercial and public sector teams to define requirements, ensuring our evaluation services are robust enough for the most demanding government use cases. Key objectives include refining the tech stack that allows ML teams to hillclimb, and surfacing critical performance information to stakeholders.
Minimum Qualifications (Quantifiable)
- Engineering Depth: 3+ years of experience in software engineering, systems architecture, or highly technical program management. You must be able to read code, understand system architecture, and participate in technical design reviews alongside engineering teams.
- Evaluation Systems Expertise: Proven experience designing, owning the roadmap for, or operating the infrastructure required to continuously measure, improve, and show the performance of AI applications.
- Problem Distillation: Demonstrated experience taking a vaguely defined problem (e.g., "our evaluation cycles are too slow") and delivering a technical roadmap, resource requirements, and measurable success metrics within a narrow time window.
- Ambiguity Management: Proven track record of taking a project from "stalled/undefined" to "shipped" in a high-pressure environment. You can point to at least two instances where you inherited a failing project and saw it through to production.
- Cross-Functional Leadership: Led multiple projects that required direct alignment between at least three distinct engineering organizations (e.g., Infrastructure, ML Research, and Product).
- Operational Execution: Experience using technical project management frameworks (e.g., Linear) to provide consistent weekly reporting on delivery velocity and blockers to executive stakeholders.
Preferred Qualifications (Nice to Haves)
- Security Clearance: Active Secret, Top Secret, or TS/SCI clearance.
- GenAI Implementation: Practical experience developing or evaluating features built specifically on LLMs, RAG, or autonomous agent workflows.
- Technical Rigor: Advanced degree in Computer Science, Engineering, or a related field.
- Public Sector Expertise: 2+ years of experience working with DoD, IC, or Civil agencies on mission-critical software deployments.
Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may span several career levels at Scale; the final salary will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity-based compensation, subject to Board of Directors approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for an equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. This role may also be eligible for additional benefits such as a commuter stipend.
Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in San Francisco, New York, and Seattle is:
$205,600 - $257,000 USD
The base salary range for this full-time position in Hawaii, Washington DC, Texas, and Colorado is:
$184,800 - $231,000 USD
The base salary range for this full-time position in St. Louis is:
$154,400 - $193,000 USD
PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.
About Us:
At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Ernst & Young, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.
We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status.
We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us (email address available via click.appcast.io). Please see the United States Department of Labor's Know Your Rights poster for additional information.
We comply with the United States Department of Labor's Pay Transparency provision.
PLEASE NOTE: We collect, retain and use personal data for our professional business purposes, including notifying you of job opportunities that may be of interest and sharing with our affiliates. We limit the personal data we collect to that which we believe is appropriate and necessary to manage applicants' needs, provide our services, and comply with applicable laws. Any information we collect in connection with your application will be treated in accordance with our internal policies and programs designed to protect personal data. Please see our privacy policy for additional information.

