Applied AI / Evaluation Engineer
$155k
Navex Inc
At NAVEX, we're transforming the world: making it safer and more ethical, and ensuring every voice is heard. That's real impact.
Our high-performance culture is driven by our values. We move with speed, passion and purpose - as one team. We are bold in our ideas, accountable in our actions, and committed to doing the right things right.
You will join our Artificial Intelligence and Machine Learning team that shares a passion for designing quality solutions, embracing new technologies and delivering powerful products within our integrated risk and compliance management platform that help our customers protect their reputation and bottom line. We are changing the way people experience life at work!
As an Applied AI / Evaluation Engineer, you will own the quality, measurement, and behavioral assurance of the NAVEX AI Product System. You will build and operate evaluation harnesses, quality gating mechanisms, and human-in-the-loop tooling that ensure AI behavior is safe, consistent, and improving over time. In an agentic context, you will create the evaluation and regression testing systems that reduce drift and make agent behavior predictable, integrating continuous evaluation into CI/CD and production monitoring. You will be the guardian of AI quality, ensuring that no AI capability reaches production without passing rigorous evaluation. If you want to ensure enterprise agentic AI systems are trustworthy and measurably excellent, this role is for you.
You'll thrive in this hybrid role surrounded by an engaged, collaborative team deeply committed to your success. Join us and help shape what's next!
What you'll get:
- Meaningful Purpose. Your work helps organizations operate with integrity and protect their people, at a scale few companies can match.
- High-Performance Environment. We move with urgency, set ambitious goals, and expect excellence. You'll be trusted with real ownership and supported to do the best work of your career.
- Candid, Supportive Culture. We communicate openly, challenge ideas rather than people, and value teammates who embrace bold thinking and continuous improvement.
- Growth That Matters. You can count on authentic feedback, strong accountability, and leaders invested in your success so you can achieve real growth.
- Rewards for Results. We provide clear, competitive compensation designed to recognize measurable outcomes and real impact.
What you'll do:
- Design, build, and operate the AI evaluation and regression harness that gates all AI releases, developing scenario suites, golden traces, and automated quality gates to reduce drift and make behavior predictable
- Define and maintain evaluation dimensions including groundedness, accuracy, relevance, safety, and policy adherence
- Build and curate versioned reference datasets (golden sets) covering common usage patterns and known failure modes
- Implement LLM-as-judge evaluation pipelines and rationale validation frameworks
- Develop and operate human-in-the-loop (HITL) tooling and signal capture systems
- Build drift detection and regression tracking capabilities to monitor AI behavioral stability over time
- Design quality gates that enforce measurable thresholds before AI capabilities are promoted to production
- Instrument agent observability, including end-to-end tracing for agent runs (tool-call success rates, failure analysis, latency and cost monitoring), and use it to debug and continuously improve agent behavior
- Normalize and associate human review signals with AI interactions for continuous improvement
- Collaborate with data scientists and platform engineers to instrument telemetry across AI system components
- Produce evaluation reports and quality metrics that support governance, compliance, and leadership review
What you'll bring:
- Bachelor's or Master's degree in Computer Science, Data Science, Statistics, or a related STEM field
- 5+ years' experience in ML engineering, AI evaluation, or applied AI quality assurance
- Strong experience building evaluation harnesses, regression testing frameworks, and quality gating pipelines for LLM-based systems
- Experience with LLM-as-judge methodologies and automated evaluation techniques
- Evaluation-first mindset: experience implementing continuous evaluation pipelines that integrate with CI/CD and production monitoring, including stress testing against edge cases and adversarial scenarios
- Proficiency in Python with experience in ML/NLP evaluation libraries and frameworks
- Knowledge of statistical methods for measuring AI quality, drift, and behavioral stability
- Observability literacy for agent decisions: ability to implement or use tooling that evaluates agent behaviors like tool selection and tool argument correctness
- Experience designing and implementing human-in-the-loop review workflows
- Understanding of AI safety, bias detection, and policy compliance evaluation, with practical security awareness for LLM applications
- Comfort working in an iterative "build, test, ship, observe, refine" cycle
- Culture Agility. Comfort working in a fast-paced, candid environment that values innovation, healthy debate, and follow-through
- Fuel performance and outcomes. Leverage your job competencies and champion NAVEX's core values
Our side of the deal:
- We'll be clear, we'll move fast, and we'll invest in your success. You deserve to be supported, challenged, and rewarded for the impact you make, and we commit to doing that every step of the way.
- The starting pay for this role is $155,000+ per annum with 15% MBO. Visit our career page to discover how you can grow, lead, and make an impact. NAVEX is an equal opportunity employer committed to including individuals of all backgrounds, including those with disabilities and veteran status.


