Agentic AI Evaluation Engineer

$142.65k - $213.98k

Blueface Ltd

Job Summary

The Agent Evaluation team is responsible for testing whether AI agents return the correct and expected responses. We build the framework, metrics, and test cases that validate agent behavior, accuracy, and reliability before release. Our goal is to ensure agents perform consistently and meet product and user expectations.

Responsibilities

  • Design and develop agent evaluation pipelines across development, staging, and production environments.
  • Define and standardize evaluation metrics and benchmarks for conversational AI quality (accuracy, relevance, CX, safety).
  • Build automated and human-in-the-loop evaluation systems to assess agent performance.
  • Manage and curate evaluation datasets, test sets, and annotation workflows.
  • Enable continuous evaluation and monitoring of agents in production.
  • Integrate evaluation into CI/CD pipelines to support safe and efficient releases.
  • Conduct experiments, A/B testing, and case studies to drive improvements in agent quality.
  • Partner with engineering and product teams to deliver high-quality AI solutions.
  • Create technical documentation and drive best practices across teams.
  • Mentor junior engineers and contribute to team growth.

Qualifications

  • 5‑7 years of relevant experience in AI evaluation, customer support AI, or chatbot platforms.
  • Strong understanding of responsible AI, including bias, fairness, and hallucination mitigation.
  • Proficiency with large language models (LLMs) and machine learning (ML) techniques.
  • Experience with benchmarking, CI/CD, and automation of evaluation pipelines.
  • Excellent communication skills and ability to work cross‑functionally.

Preferred Skills

  • Experience in customer support AI or chatbot platforms.
  • Understanding of responsible AI (bias, fairness, hallucination mitigation).

Education & Certifications

  • Bachelor’s degree in a related field (preferred but not required).
  • Relevant certifications are a plus.

Compensation

Primary location: Pay range $142,651.46 – $213,977.19. Base pay is one part of the Total Rewards package and may include bonuses.

Benefits

Eligible employees receive comprehensive benefits covering physical, financial, and emotional well-being, including health, dental, vision, and various support resources.

Equal Opportunity

Comcast is an equal opportunity workplace. We will consider all qualified applicants for employment without regard to race, color, religion, age, sex, sexual orientation, gender identity, national origin, disability, veteran status, genetic information, or any other basis protected by applicable law.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Agentic AI Evaluation Engineer in Washington DC vacancy
  • $142.65k - $213.98k

     ...for the remote option.) Job Summary The Agent Evaluation team is responsible for testing whether AI agents return the correct and expected responses. We...  ...improvements in agent quality Partner with engineering, and product teams to deliver high-quality AI solutions... 
    Suggested
    Work experience placement
    Work at office
    Remote work
    Worldwide

    Comcast Corporation

    Washington DC
    4 days ago
  • $99k - $225k

     ...Agentic AI Engineer The Opportunity: As an experienced engineer, you know how to design, develop, and deliver production-grade agentic...  ...-augmented generation (RAG), agentic orchestration, evaluation pipelines, and human-in-the-loop systems to deliver measurable... 
    Suggested
    Full time
    Contract work
    Part time
    Work at office
    Local area
    Remote work

    Booz Allen Hamilton

    Arlington, VA
    1 day ago
  • Agentic AI Engineer Location: Remote / Alexandria, VA Clearance: Eligibility to be cleared Are you ready to be part of a team that creates...  ...practices in developing agent-based AI solutions. Continuously evaluate, test, and enhance the capabilities and performance of... 
    Suggested
    Work at office
    Remote work

    GeoDelphi

    Alexandria, VA
    4 days ago
  • $99k - $225k

    The Opportunity We are looking for a highly skilled Agentic AI Engineer to join our team specializing in building autonomous, goal‑oriented...  ...performance, including ONNX, GGML, or Ollama. Develop evaluation frameworks to test agent reliability, safety, and performance... 
    Suggested
    Contract work
    Local area
    Shift work

    Booz Allen Hamilton

    Washington DC
    2 days ago
  • $176k - $265k

     ...Location Type Remote Department Engineering Compensation Zone 1 $195K -...  ...rebuilding biotech for the AI era. When a breakthrough is...  ...‑functional and company‑wide agentic AI applications that span departmental...  ...memory and state management, evaluation, and observability. Make... 
    Suggested
    Full time
    Local area
    Remote work

    Albert Invent

    Washington DC
    3 days ago
  • $160k - $205k

     ...Space builds next-generation AI systems that help “supercharge...  ...platform. We are seeking an engineer who thrives in a dynamic, fast...  ...and traceability. Develop evaluation and regression testing for agent...  ...to identify and prioritize agentic automation opportunities tied... 
    Temporary work
    Work at office
    Flexible hours

    Valid8 Financial, Inc.

    Washington DC
    3 days ago
  • $233.3k - $385k

     ...seeking a distinguished and customer-facing Fellow AI Engineer to define and lead UKG’s enterprise-wide Agentic AI strategy across the full UKG Product Suite....  ...execution and decision intelligence • AI safety, evaluation, and guardrails Establish standards for... 
    Local area

    UKG

    Washington DC
    3 days ago
  • $172.2k - $236.9k

     ...Principal AI Engineer The Principal AI Engineer acts as Humana's senior technical authority for agentic AI platforms, overseeing strategy, architecture, and engineering execution...  ...including model deployment, monitoring, evaluation, and lifecycle management. ~ Expert-... 
    Bi-weekly pay
    Apprenticeship
    Work at office
    Work from home
    Home office

    Humana

    Washington DC
    3 days ago
  • $77.6k - $176k

    AI Developer Locations: Arlington, VA; San Antonio, TX...  ...As an experienced AI engineer, you know that AI systems are...  ...by applying cutting-edge Agentic frameworks. You’ll be part of...  ...automated testing, or model evaluation and monitoring in constrained... 
    Full time
    Contract work
    Part time

    Booz Allen Hamilton

    Arlington, VA
    4 days ago
  •  ...results for the government. We are currently seeking an Agentic AI Systems Engineer with 6+ years of experience to join our team and fully embrace...  ..., and alignment with applicable standards.  Design evaluation frameworks and test harnesses to assess quality, factuality... 
    Full time
    Temporary work
    Work experience placement
    Work at office
    Local area
    Immediate start
    Afternoon shift

    Corner Alliance

    Washington DC
    19 hours ago
  • $99k - $225k

     ...Agentic AI Machine Learning Engineer The Opportunity: As an experienced machine learning engineer, you understand good software is more than...  ...to APIs, Cloud platforms, or databases ~ Experience evaluating LLM performance and building observation layers for stakeholders... 
    Full time
    Contract work
    Part time
    Work at office
    Local area
    Remote work

    Booz Allen Hamilton

    Washington DC
    20 days ago
  • $99k - $225k

    Agentic AI Machine Learning Engineer Locations: Washington, DC; McLean, VA; Arlington...  ...: R0240554 Agentic AI Machine Learning Engineer The Opportunity: As an experienced...  ...Cloud platforms, or databases. Experience evaluating architectural tradeoffs and designing... 
    Full time
    Contract work
    Part time
    Work at office
    Local area
    Remote work

    Booz Allen Hamilton

    Washington DC
    3 days ago
  • $195.5k - $264.5k

     ...Family: Data Science and Data Engineering Job Qualifications: Skills: AI Agents, Collaboration, Microsoft...  ...most complex challenges. As an Agentic AI Engineer, you will be...  ...embedding to deployment and continuous evaluation, optimized for reusability and... 
    Temporary work
    Work at office
    Immediate start
    Remote work
    Worldwide
    Flexible hours

    General Dynamics Information Technology

    Falls Church, VA
    19 hours ago
  •  ...We build AI agents that actually work in enterprise environments, not prototypes, not demos. We need engineers who can own the entire agent stack: a production frontend, a robust...  ...lead technical architect and builder of agentic systems running in AWS, OCI, and Azure.... 
    Temporary work

    Trilagen

    Bethesda, MD
    4 days ago
  •  ...Senior Agentic AI Developer We are seeking a Senior Agentic AI Developer to design, build...  ...in modern AI systems, strong software engineering skills, and hands-on experience...  ...Ensure reliability, observability, and evaluation of AI agent outputs Establish guardrails... 

    PLANIT Group

    Arlington, VA
    13 days ago
  • $40 per hour

    A leading technology firm is seeking experienced cybersecurity professionals to evaluate AI-generated security content and provide feedback for improving AI systems. This remote position offers flexibility in project selection, with an hourly pay starting at $40+. Candidates... 
    Hourly pay
    Remote work

    DataAnnotation

    Washington DC
    6 days ago
  • Blueface Ltd in Washington seeks an experienced AI Evaluator to design and develop evaluation pipelines for conversational AI. The role involves defining metrics, conducting experiments, and ensuring high-quality AI solutions. The ideal candidate will have 5-7 years of... 

    Blueface Ltd

    Washington DC
    4 days ago
  • A leading technology firm is seeking a skilled Agentic AI Engineer to develop advanced AI agents for cybersecurity. This role requires creativity, expertise in Python and agent frameworks, and a commitment to enhancing national security through innovative AI solutions.... 
    Remote job
    Work at office

    GeoDelphi

    Alexandria, VA
    4 days ago
  • $99k - $225k

    Booz Allen Hamilton is seeking an Agentic AI Engineer to join their team in Washington, DC. The role focuses on building autonomous AI systems that actively engage in multi-agent orchestrations, utilizing advanced technologies such as agent orchestration frameworks and... 

    Booz Allen Hamilton

    Washington DC
    2 days ago
  • $40 per hour

    A leading cybersecurity firm is looking for experienced cybersecurity professionals to evaluate AI-generated content and solve technical problems. The role involves working with advanced AI models, providing feedback, and contributing to the cybersecurity industry's future... 
    Hourly pay
    Remote work
    Flexible hours

    DataAnnotation

    Washington DC
    1 day ago
  • $229.9k - $262.4k

    Senior Lead AI Engineer (MLX, Agentic AI, Gen AI platform Services) Overview: At Capital One, we are creating responsible and reliable AI systems...  ...model inference, similarity search, guardrails, model evaluation, experimentation, governance, and observability, etc. Leverage... 
    Full time
    Part time
    Local area

    Capital One

    Mc Lean, VA
    1 day ago
  • $168.33k - $252.49k

     ...are looking for a Senior Software Engineer to lead the technical direction of our AI Agent initiatives. This is a...  ...production‑grade backend services and agentic workflows that solve complex, non...  ...operable over time. Testing & Evaluation of Probabilistic Systems - Design... 
    Worldwide

    Blueface Ltd

    Washington DC
    4 days ago
  • Corner Alliance is seeking an Agentic AI Systems Engineer to design and develop AI-enabled interfaces in Washington, D.C. The role requires expertise in software engineering, cloud application development, and AI integration. Responsibilities include constructing a secure... 

    Corner Alliance

    Washington DC
    3 days ago
  • $99k - $225k

    Phase2 Technology is seeking a skilled Machine Learning Engineer to design and implement AI solutions for the Defense and Intelligence sectors. The...  ...should be experienced in cloud environments, evaluation of model performance, and have a strong background in machine... 

    Phase2 Technology

    Alexandria, VA
    4 days ago
  • $130k - $150k

    BLEN is looking for an AI Engineer in Washington, DC. The role involves designing and building agentic systems for federal and commercial clients, focusing on large language model applications. Candidates should have 5+ years of software engineering experience, hands-on... 

    BLEN

    Washington DC
    1 day ago
  • $99k - $225k

    Job Number: R0229614 Agentic AI Machine Learning Engineer The Opportunity As an experienced machine learning engineer, you understand good software...  ...to APIs, Cloud platforms, or databases Experience evaluating architectural tradeoffs and designing robust service-based... 
    Full time
    Contract work
    Part time
    Local area
    Remote work

    Phase2 Technology

    Mc Lean, VA
    3 days ago
  • $130k - $150k

    BLEN Corp is seeking an AI Engineer in Washington, DC to design and build AI systems for federal and commercial clients. The role involves developing agentic systems, creating LLM-powered applications, and working closely with stakeholders. Ideal candidates should have... 
    Work from home

    BLEN Corp

    Washington DC
    19 hours ago
  • $160k - $180k

     ...Platinum Technologies is seeking an AI Engineer to join our company. The AI Engineer will...  ...Google Cloud Platform (GCP), and AWS. • Evaluate and recommend hosting, infrastructure,...  ...Protocol), AI orchestration frameworks, and agentic workflows. • Demonstrated experience... 
    Remote work

    Platinum Technologies

    Washington DC
    2 days ago
  • $130k - $150k

     ...AI Engineer Location: Remote (US only) Compensation: $130,000 to $150,000 A small, technical...  ...-on engineering role. You will design agentic systems, build MCP servers and clients,...  ..., Qdrant, or similar), and retrieval evaluation ~ Comfort with prompt engineering,... 
    Remote work

    Knak Digital

    Washington DC
    2 days ago
  •  ...Developer Premium II – Agentic AI Engineer Duration: 6 Months - Long Term Location: Washington, DC 20433 Hybrid Onsite: 4 days per week...  ...scalable compute environments for model training, tuning, and evaluation. Scope of Work Develop multi-agent pipelines... 

    Mindlance

    Washington DC
    1 day ago
