ML Engineer - Automated Evaluation and Adversarial Design

$139.5k - $258.1k

Apple Inc.

Seattle, Washington, United States
Software and Services

The Productivity and Machine Learning Evaluation team ensures the quality of AI-powered features across a suite of productivity and creative applications, including Creator Studio, used by hundreds of millions of people. This team serves as the primary evaluation function, providing critical quality signals that directly influence model development decisions and product launches.

This role focuses on building and scaling automated evaluation systems and designing adversarial and stress-testing methodologies across multiple AI features. The work requires a deep understanding of how AI systems fail and how to measure quality rigorously. As features evolve from single-turn interactions into multi-turn, agentic experiences, the evaluation challenge shifts from assessing individual outputs to stress-testing entire conversation flows and agent decision chains. This is an opportunity to shape the evaluation infrastructure that determines whether AI features meet the bar for hundreds of millions of users.

Description

Day-to-day work involves designing, building, and maintaining automated evaluation systems that assess AI feature quality at scale, including multi-turn conversation evaluation and end-to-end agent workflow testing. This includes creating adversarial test suites that probe model weaknesses and running stress tests to ensure features perform under demanding conditions, with particular focus on failure modes that only emerge across extended interactions, such as context degradation, goal drift, and compounding errors.

Typical deliverables include evaluation frameworks and rubrics, quality assessment reports, adversarial test case libraries, multi-turn stress-test pipelines, and recommendations on model readiness.

Responsibilities

Define and own the automated evaluation approach for AI features, translating qualitative notions of quality into measurable, reproducible assessments across both single-turn and multi-turn agentic experiences
Build adversarial test suites that target known and emerging model failure modes, including edge cases relevant to productivity application workflows and conversation-level failures such as context loss, instruction forgetting, and cascading errors across multi-step tasks
Develop and execute stress test protocols that validate minimum performance thresholds under atypical input conditions, including extended conversation lengths, adversarial mid-conversation topic shifts, and complex tool-use sequences
Ensure alignment between automated and human evaluation methods on an ongoing basis, identifying and resolving systematic disagreements
Collaborate with engineering partners to integrate evaluation into development and release workflows
Scale adversarial test case generation and stress test execution, leveraging automation where appropriate, including programmatic generation of multi-turn conversation scenarios and agent interaction traces
Influence model and feature quality decisions by communicating evaluation findings and readiness assessments to cross-functional partners

Minimum Qualifications

Bachelor’s degree in Computer Science, Machine Learning, Statistics, or a related field
4+ years of experience building or significantly extending ML evaluation systems, including designing evaluation benchmarks or quality assessment frameworks and evaluating sequential or multi-step AI outputs
Experience independently defining evaluation architecture and methodology for AI or ML systems, with the ability to design evaluation approaches where the unit of analysis is a conversation or session rather than a single output
Experience designing adversarial or red-teaming test methodologies for ML models or AI-powered features, including adversarial scenarios that target failures across multi-turn interactions
Experience with Python and ML frameworks (PyTorch, TensorFlow, or equivalent) in production or near-production settings
Track record of owning technical direction for evaluation efforts across multiple features or product areas

Preferred Qualifications

Experience evaluating user-facing AI features in consumer applications, with an understanding of how technical metrics connect to user-perceived quality
Familiarity with productivity software or creative tools, with the ability to assess output quality from a user workflow perspective
Experience ensuring alignment between automated and human evaluation methods, including inter-annotator agreement analysis and bias detection
Track record of designing evaluation systems that scale across multiple features or product areas without requiring bespoke solutions for each
Experience evaluating different types of AI systems, including API-based and custom-trained models
Demonstrated ability to communicate evaluation findings and readiness assessments to cross-functional partners
Experience leveraging automation to scale evaluation data generation and analysis
Experience building evaluation pipelines for conversational AI, dialogue systems, or agentic workflows, including turn-level and session-level automated scoring
Familiarity with agent orchestration frameworks (LangChain, LangGraph, CrewAI, AutoGen) and observability tooling (LangSmith, Braintrust, Arize), with an understanding of how to instrument and evaluate multi-step agent runs
Experience designing adversarial tests for tool-use reliability, function-calling accuracy, or agent planning quality

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role.
The base pay range for this role is between $139,500 and $258,100, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits.

Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.

Apple accepts applications to this posting on an ongoing basis.

Vacancy posted 3 days ago
