ML Engineer - Evaluation Analysis, Metric and Data Strategy

Apple Oakbrook

Weekly Hours: 40

Role Number: 200657984-3337

Summary

The Productivity and Machine Learning Evaluation team ensures the quality of AI-powered features across a suite of productivity and creative applications, including Creator Studio, used by hundreds of millions of people. This team serves as the primary evaluation function, and its analysis directly informs decisions about model development, feature launches, and product direction. This role is the analytical core of the team, responsible for making sense of evaluation signals and real-world user behavior. The work involves designing feature-level quality metrics, collaborating with partner teams on data collection strategies, and translating evaluation data into concise, actionable insights that drive decisions. This is an opportunity to define how AI feature quality is measured and to directly shape what gets shipped. As AI features evolve into multi-turn, agentic experiences, this role will define what “quality” means when the unit of evaluation is a conversation, not a single response.

Description

Day-to-day work involves analyzing evaluation results and identifying trends, regressions, and segment-level patterns across multiple AI features. This includes collaborating with partner teams on data collection strategies, ensuring evaluation data is representative of real-world usage, and designing the metrics framework that leadership uses to make decisions on AI features. Typical deliverables include feature-level quality metrics and dashboards, evaluation analysis reports, data collection requirements, dataset representativeness audits, multi-turn evaluation frameworks and session-level scoring rubrics, and concise metric summaries for decision-makers.

Minimum Qualifications

  • Bachelor’s degree in Statistics, Data Science, Applied Mathematics, Computer Science, or a related quantitative field

  • 5+ years of experience in applied science, data science, or evaluation research, with a focus on defining and operationalizing quality metrics

  • Experience with statistical analysis methods including significance testing, sampling design, effect size estimation, and experimental design

  • Experience working with production user data, understanding its biases and limitations compared to controlled evaluation data, including familiarity with sequential interaction data where context and turn order affect quality assessment

  • Ability to design evaluation approaches where the unit of analysis is a session or conversation rather than a single model output

  • Track record of independently designing metrics frameworks and driving data-informed decisions across cross-functional teams

  • Proficiency in Python (pandas, scipy, scikit-learn) or R for data analysis and visualization

Preferred Qualifications

  • Experience designing evaluation or quality metrics for AI-powered or ML-driven features in consumer-facing products

  • Familiarity with productivity software or creative applications, with an ability to distinguish between technically correct and genuinely useful AI outputs

  • Experience partnering with engineering or data teams to define data collection requirements and schemas

  • Track record of translating complex analytical findings into concise recommendations for non-technical decision-makers

  • Experience evaluating tool-use accuracy, retrieval quality, or function-calling reliability within AI systems

  • Experience with evaluation methodology including inter-annotator agreement, evaluation bias detection, and dataset representativeness auditing

  • Familiarity with agentic orchestration frameworks (LangChain, LangGraph, CrewAI, AutoGen) and emerging agent interoperability protocols (A2A, MCP), with an understanding of how architectural choices in agent design affect evaluability

  • Understanding of ML model development processes, with the ability to specify what evaluation signals are useful for model improvement

  • Experience managing evaluation across multiple features or product areas simultaneously, with systematic rather than ad-hoc approaches

  • Graduate degree in a relevant quantitative field

Vacancy posted 1 day ago