ML Engineer - Automated Evaluation and Adversarial Design

$139.5k - $258.1k

Apple Oakbrook

Culver City, California, United States
Software and Services

The Productivity and Machine Learning Evaluation team ensures the quality of AI-powered features across a suite of productivity and creative applications, including Creator Studio, used by hundreds of millions of people. This team serves as the primary evaluation function, providing critical quality signals that directly influence model development decisions and product launches.

This role focuses on building and scaling automated evaluation systems and designing adversarial and stress-testing methodologies across multiple AI features. The work requires a deep understanding of how AI systems fail and how to measure quality rigorously. As features evolve from single-turn interactions into multi-turn, agentic experiences, the evaluation challenge shifts from assessing individual outputs to stress-testing entire conversation flows and agent decision chains. This is an opportunity to shape the evaluation infrastructure that determines whether AI features meet the bar for hundreds of millions of users.

Description

Day-to-day work involves designing, building, and maintaining automated evaluation systems that assess AI feature quality at scale, including multi-turn conversation evaluation and end-to-end agent workflow testing. This includes creating adversarial test suites that probe model weaknesses and running stress tests to ensure features perform under demanding conditions, with particular focus on failure modes that only emerge across extended interactions, such as context degradation, goal drift, and compounding errors. Typical deliverables include evaluation frameworks and rubrics, quality assessment reports, adversarial test case libraries, multi-turn stress-test pipelines, and recommendations on model readiness.

Responsibilities

  • Define and own the automated evaluation approach for AI features, translating qualitative notions of quality into measurable, reproducible assessments across both single-turn and multi-turn agentic experiences
  • Build adversarial test suites that target known and emerging model failure modes, including edge cases relevant to productivity application workflows and conversation-level failures such as context loss, instruction forgetting, and cascading errors across multi-step tasks
  • Develop and execute stress test protocols that validate minimum performance thresholds under atypical input conditions, including extended conversation lengths, adversarial mid-conversation topic shifts, and complex tool-use sequences
  • Ensure alignment between automated and human evaluation methods on an ongoing basis, identifying and resolving systematic disagreements
  • Collaborate with engineering partners to integrate evaluation into development and release workflows
  • Scale adversarial test case generation and stress test execution, leveraging automation where appropriate, including programmatic generation of multi-turn conversation scenarios and agent interaction traces
  • Influence model and feature quality decisions by communicating evaluation findings and readiness assessments to cross-functional partners

Minimum Qualifications

  • Bachelor’s degree in Computer Science, Machine Learning, Statistics, or a related field
  • 4+ years of experience building or significantly extending ML evaluation systems, including designing evaluation benchmarks or quality assessment frameworks and evaluating sequential or multi-step AI outputs
  • Experience independently defining evaluation architecture and methodology for AI or ML systems, with the ability to design evaluation approaches where the unit of analysis is a conversation or session rather than a single output
  • Experience designing adversarial or red-teaming test methodologies for ML models or AI-powered features, including adversarial scenarios that target failures across multi-turn interactions
  • Experience with Python and ML frameworks (PyTorch, TensorFlow, or equivalent) in production or near-production settings
  • Track record of owning technical direction for evaluation efforts across multiple features or product areas

Preferred Qualifications

  • Experience evaluating user-facing AI features in consumer applications, with an understanding of how technical metrics connect to user-perceived quality
  • Familiarity with productivity software or creative tools, with the ability to assess output quality from a user workflow perspective
  • Experience ensuring alignment between automated and human evaluation methods, including inter-annotator agreement analysis and bias detection
  • Track record of designing evaluation systems that scale across multiple features or product areas without requiring bespoke solutions for each
  • Experience evaluating different types of AI systems, including API-based and custom-trained models
  • Demonstrated ability to communicate evaluation findings and readiness assessments to cross-functional partners
  • Experience leveraging automation to scale evaluation data generation and analysis
  • Experience building evaluation pipelines for conversational AI, dialogue systems, or agentic workflows, including turn-level and session-level automated scoring
  • Familiarity with agent orchestration frameworks (LangChain, LangGraph, CrewAI, AutoGen) and observability tooling (LangSmith, Braintrust, Arize), with an understanding of how to instrument and evaluate multi-step agent runs
  • Experience designing adversarial tests for tool-use reliability, function-calling accuracy, or agent planning quality

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $139,500 and $258,100, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits.

Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the ML Engineer - Automated Evaluation and Adversarial Design in Culver City, CA vacancy
  • $139.5k - $258.1k

    Apple Inc. is seeking an ML Engineer in Culver City, California, to work on automated evaluation systems for AI features. The role involves defining evaluation methods, building adversarial test suites, and collaborating with engineering teams to ensure quality across... 
    Suggested

    Apple Inc.

    Culver City, CA
    1 day ago
  • AI/ML Engineer - Architectural Drawing Understanding (US) Responsibilities...  ...format. The role emphasizes designing and training computer vision...  ...training. Benchmark, evaluate, and continuously improve model...  ...vision models into design automation and CAD/BIM workflows. Qualifications... 
    Suggested

    Genia

    Los Angeles, CA
    19 hours ago
  •  ...drawing data. Train and evaluate deep learning models (e...  ...the guidance of senior engineers. Support the data...  ...Python and at least one ML/CV framework (e.g., PyTorch...  ...product, Structural CoPilot, automates the generation of structural engineering design drawings for the... 
    Suggested
    Full time
    Internship

    Genia

    Los Angeles, CA
    3 days ago
  • $171.6k - $230.1k

     ...Staff GenAI/ML Engineer (Emerging Tech & AI Automation) At Disney, we’re storytellers. We make the impossible, possible...  ...support long‑term innovation. Lead design and rapid prototyping of GenAI‑...  ...‑impact business opportunities. Evaluate and integrate LLMs and modern GenAI... 
    Suggested
    Permanent employment
    Full time

    1008 Disney Worldwide Services, Inc.

    Los Angeles, CA
    19 hours ago
  • $139.5k - $258.1k

    ML Engineer - Evaluation Analysis, Metric and Data Strategy Culver City, California, United States Software and Services The Productivity and...  ...signals and real‑world user behavior. The work involves designing feature-level quality metrics, collaborating with partner... 
    Suggested
    Relocation

    Apple Inc.

    Culver City, CA
    1 day ago
  • A leading manufacturing technology firm in Los Angeles is seeking a Senior Machine Learning Engineer to design and build advanced software systems for automating precision manufacturing. The engineer will work on cutting-edge deep learning models, contribute to the Machine... 

    Hadrian Automation

    Los Angeles, CA
    1 day ago
  • $175k - $225k

     ...today! POSITION PURPOSE The Senior ML Ops Engineer leads the design and maintenance of scalable, secure...  ...of AI value realization by automating and scaling ML models and GenAI applications...  ...engineering best practices and LLM evaluation frameworks to ensure output quality... 
    16 hours
    Local area

    Medium

    Los Angeles, CA
    1 day ago
  •  ...hardware, firmware, and software development. You will design cutting-edge robotic automation systems, build robust test frameworks, and drive...  ...Requirements Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, Mechanical Engineering, or a... 

    EPAM

    Los Angeles, CA
    19 hours ago
  •  ...a full‑time, fully onsite, hands‑on AI/ML Engineer contract role. You’ll use state‑of‑the‑art...  ..., Vector Databases, and Azure OpenAI—to design, build and deploy production‑grade...  ...preferred) Observability, monitoring, and evaluation frameworks Retrieval‑Augmented Generation... 
    Full time
    Contract work

    Motion Recruitment Partners LLC

    Los Angeles, CA
    19 hours ago
  • $160k - $195k

    Meredith Corporation is seeking a Senior Software Engineer for ML to enhance user personalization through advanced recommendation algorithms. You will own the design and implementation of a core personalization engine, collaborating with product and data teams to deliver... 
    Remote job

    Meredith Corporation

    Los Angeles, CA
    19 hours ago
  •  ...About Machina Labs   Engineering moves at software...  ...directly from digital design. By integrating advanced...  ...forming, robotics, and automated production inside a...  ...architecting scalable ML pipelines. If you’re passionate...  ...Design, build, train, evaluate, and deploy machine... 
    Flexible hours

    Machina Labs

    Los Angeles, CA
    11 days ago
  • $134.25k - $196.9k

     ...leader About The Role The Audio ML Engineer (Research) develops learning...  ...and drive adaptive behavior—designed from the start for embedded...  ...in cloud pipelines (batch evaluation, fleet learning, offline...  ...assistants, data analysis copilots, automated report generation) to... 
    Full time
    Temporary work
    Immediate start
    Remote work
    Flexible hours

    HARMAN

    Los Angeles, CA
    1 day ago
  • Ramboll Group A/S is seeking a Control Systems Engineer based in Los Angeles, CA. You will lead automation design teams and mentor engineers while enhancing HVAC control systems. The ideal candidate has 5+ years of experience in data center engineering and control system... 

    Ramboll Group A/S

    Los Angeles, CA
    1 day ago
  • $101.9k - $163k

     ...fabric of how we work every day. To learn more, please see The AI/ML Engineer - Higher Education builds AI capabilities for Cengage's higher...  ..., learning outcomes, and instructor productivity. You will design, build, and ship production AI features integrated directly... 
    Live in
    Local area
    Worldwide

    Cengage Group

    Los Angeles, CA
    2 days ago
  • $160k - $250k

     ...advanced software systems to automate Design for Manufacturing (DFM)...  ...augment or automate complex engineering judgment. As a Senior Machine...  ...training, inference, labeling, and evaluation Judiciously combine open-...  ...throughout the entire ML Lifecycle Proficiency in Python... 
    Permanent employment
    For contractors
    Local area
    Immediate start
    Relocation
    Flexible hours

    Hadrian Automation

    Los Angeles, CA
    3 days ago
  • $160k - $180k

     ...seeking a Senior Machine Learning Engineer to join our growing team...  ..., and software engineering to design systems that can reason, adapt...  ...applications Continuously evaluate and improve model performance,...  ...Strong proficiency in Python and ML frameworks like PyTorch,... 
    Local area

    Fox

    Los Angeles, CA
    19 hours ago
  •  ...industry together! Machine Learning Engineer, Applied AI As an MLE you'll...  ...and work across the full applied ML stack - deploying models, building the evaluation systems that tell us whether they...  ...patterns and privacy-by-design data handling Open-source contributions... 
    Work at office
    Remote work
    Work from home
    Worldwide
    Home office
    Flexible hours

    CreatorIQ

    Los Angeles, CA
    4 days ago
  • $300k - $375k

     ...and delivery of offline/online ML systems, feature pipelines,...  ...loops, and monitoring. Lead the design, build, and evolution of...  ...frameworks, including offline evaluation, A/B testing, KPI design, and...  ...architecture. Partner closely with Data Engineering, BI, Product, Engineering,... 
    Full time
    Flexible hours

    Prodege

    El Segundo, CA
    19 hours ago
  • $145.6k - $240.24k

    Machine Learning (ML) Ops Engineer - IS Clinical Research - Full Time 8 Hour Days (Exempt) (Non...  ...of machine learning models, including design, build, and maintenance of machine learning...  ...will ensure seamless integration, automation, and scaling of AI solutions within the... 
    Full time
    Work experience placement
    Local area

    University of Southern California

    Los Angeles, CA
    1 day ago
  • $132k - $165k

    Machine Learning Engineer, Applied AI As an MLE you’ll join our Product Innovations...  ...and work across the full applied ML stack - deploying models, building the evaluation systems that tell us whether they...  ...patterns and privacy‑by‑design data handling. Open‑source contributions... 
    Work at office
    Work from home
    Home office

    CreatorIQ

    Los Angeles, CA
    1 day ago
  • $257k - $327k

     ...Data Center Controls Network Engineer Datacenter Design - San Francisco OpenAI is building the infrastructure...  ...into practical OT network designs, evaluates vendor solutions, and drives...  ...Key Responsibilities Define controls, automation, and OT network requirements for AI data... 
    For contractors
    Work at office
    Remote work

    OpenAI

    Los Angeles, CA
    19 hours ago
  • A next-generation loyalty platform is seeking a skilled Machine Learning Engineer in Los Angeles, CA. You'll design and implement machine learning models to enhance our platform and drive data-driven decisions. The role requires 5+ years of relevant experience, strong... 

    Hang

    Los Angeles, CA
    1 day ago
  • $171.6k - $230.1k

    Data Engineering Manager - Enterprise Technology, Data At...  ...Enterprise Technology. We design and develop enterprise data, analytics, and automation solutions used by...  ...reporting & analytics, and AI/ML applications. Lead...  ...and continuous drive to evaluate and adopt emerging data... 
    Work experience placement
    Worldwide

    1008 Disney Worldwide Services, Inc.

    Los Angeles, CA
    10 hours ago
  • $251.7k - $351.9k

    Principal Machine Learning Engineer (Personalization,...  ...expertise across data processing, automation, machine learning ("ML"), artificial intelligence ("AI"), and experimental design to inform decisions and develop...  .... Lead post-launch evaluations of algorithmic impact on player... 
    Temporary work
    Local area
    Flexible hours

    Riot Games

    Los Angeles, CA
    4 days ago
  •  ...Investment Operations Automation Analyst Tamar Securities is seeking an Investment Operations Automation Analyst to design, build, and maintain automated workflows supporting trading and investment operations as the firm scales. This role sits at the intersection of trading... 

    Tamar Securities LLC

    Los Angeles, CA
    1 day ago
  • $140k - $175k

     ...Senior Full Stack Engineer (Python, Serverless, AI Fluency) Los...  ...Angeles Vynyl's technologists, designers and product strategists are...  ...fluency with modern AI/ML development tools (e.g., GitHub...  ...Experience with CI/CD pipelines, automated testing, and Infrastructure-... 
    Full time
    Shift work

    VYNYL

    Los Angeles, CA
    3 days ago
  •  ...Manager, Data Engineering United States Brainlabs is the media...  ...to 5 years of experience in designing, building, and managing scalable...  ...for LLM applications and AI/ML model training, is a strong plus...  ...ML, or AutoML) for building, evaluating, or serving models is a... 
    Full time
    Work experience placement
    Work at office

    Brainlabs

    Los Angeles, CA
    19 hours ago
  •  ...Job Description: This Analytics Engineer role operates at the...  ...Architecture & DataMart Development Design and maintain analytic-ready datasets...  ...Python for data processing, automation, and analytics workflows...  ...and validation Exposure to AI/ML or LLM-based use cases, including... 
    Work at office
    Remote work

    Elevateprimesolutions

    Los Angeles, CA
    1 day ago
  • $295k

     ...leading AI research company is seeking a Research Engineer / Scientist in San Francisco, CA to enhance...  ...collaborating with other teams, and building robust evaluations for improvements. Ideal candidates should possess strong ML engineering skills and thrive in complex... 
    Relocation package

    OpenAI

    Los Angeles, CA
    19 hours ago
  • CreatorIQ in Los Angeles is seeking a Machine Learning Engineer to join our Product Innovations team. This role involves deploying and monitoring ML systems at scale, working with data science on evaluation workflows, and improving MLOps foundations. The ideal candidate... 

    CreatorIQ

    Los Angeles, CA
    1 day ago
