Agentic AI Evaluation Engineer

$142.65k - $213.98k

Blueface Ltd

Job Summary

The Agent Evaluation team is responsible for testing whether AI agents return the correct and expected responses. We build the framework, metrics, and test cases that validate agent behavior, accuracy, and reliability before release. Our goal is to ensure agents perform consistently and meet product and user expectations.

Responsibilities

Design and develop agent evaluation pipelines across development, staging, and production environments.
Define and standardize evaluation metrics and benchmarks for conversational AI quality (accuracy, relevance, CX, safety).
Build automated and human-in-the-loop evaluation systems to assess agent performance.
Manage and curate evaluation datasets, test sets, and annotation workflows.
Enable continuous evaluation and monitoring of agents in production.
Integrate evaluation into CI/CD pipelines to support safe and efficient releases.
Conduct experiments, A/B testing, and case studies to drive improvements in agent quality.
Partner with engineering and product teams to deliver high-quality AI solutions.
Create technical documentation and drive best practices across teams.
Mentor junior engineers and contribute to team growth.

Qualifications

5‑7 years of relevant experience in AI evaluation, customer support AI, or chatbot platforms.
Strong understanding of responsible AI, including bias, fairness, and hallucination mitigation.
Proficiency with large language models (LLMs) and machine learning (ML) techniques.
Experience with benchmarking, CI/CD, and automation of evaluation pipelines.
Excellent communication skills and ability to work cross‑functionally.

Preferred Skills

Experience in customer support AI or chatbot platforms.
Understanding of responsible AI (bias, fairness, hallucination mitigation).

Education & Certifications

Bachelor’s degree in a related field (preferred but not required).
Relevant certifications are a plus.

Compensation

Primary location: Pay range $142,651.46 – $213,977.19.
Base pay is one part of the Total Rewards package and may include bonuses.

Benefits

Eligible employees receive comprehensive benefits covering physical, financial, and emotional well-being, including health, dental, vision, and various support resources.

Equal Opportunity

Comcast is an equal opportunity workplace. We will consider all qualified applicants for employment without regard to race, color, religion, age, sex, sexual orientation, gender identity, national origin, disability, veteran status, genetic information, or any other basis protected by applicable law.


Vacancy posted 5 days ago

