ML Engineer - Automated Evaluation and Adversarial Design
$139.5k - $258.1k
Apple Oakbrook
Culver City, California, United States
Software and Services

The Productivity and Machine Learning Evaluation team ensures the quality of AI-powered features across a suite of productivity and creative applications, including Creator Studio, used by hundreds of millions of people. This team serves as the primary evaluation function, providing critical quality signals that directly influence model development decisions and product launches.

This role focuses on building and scaling automated evaluation systems and designing adversarial and stress-testing methodologies across multiple AI features. The work requires a deep understanding of how AI systems fail and how to measure quality rigorously. As features evolve from single-turn interactions into multi-turn, agentic experiences, the evaluation challenge shifts from assessing individual outputs to stress-testing entire conversation flows and agent decision chains. This is an opportunity to shape the evaluation infrastructure that determines whether AI features meet the bar for hundreds of millions of users.

Description

Day-to-day work involves designing, building, and maintaining automated evaluation systems that assess AI feature quality at scale, including multi-turn conversation evaluation and end-to-end agent workflow testing. This includes creating adversarial test suites that probe model weaknesses and running stress tests to ensure features perform under demanding conditions, with particular focus on failure modes that only emerge across extended interactions, such as context degradation, goal drift, and compounding errors. Typical deliverables include evaluation frameworks and rubrics, quality assessment reports, adversarial test case libraries, multi-turn stress-test pipelines, and recommendations on model readiness.

Responsibilities

- Define and own the automated evaluation approach for AI features, translating qualitative notions of quality into measurable, reproducible assessments across both single-turn and multi-turn agentic experiences
- Build adversarial test suites that target known and emerging model failure modes, including edge cases relevant to productivity application workflows and conversation-level failures such as context loss, instruction forgetting, and cascading errors across multi-step tasks
- Develop and execute stress test protocols that validate minimum performance thresholds under atypical input conditions, including extended conversation lengths, adversarial mid-conversation topic shifts, and complex tool-use sequences
- Ensure alignment between automated and human evaluation methods on an ongoing basis, identifying and resolving systematic disagreements
- Collaborate with engineering partners to integrate evaluation into development and release workflows
- Scale adversarial test case generation and stress test execution, leveraging automation where appropriate, including programmatic generation of multi-turn conversation scenarios and agent interaction traces
- Influence model and feature quality decisions by communicating evaluation findings and readiness assessments to cross-functional partners

Minimum Qualifications

- Bachelor’s degree in Computer Science, Machine Learning, Statistics, or a related field
- 4+ years of experience building or significantly extending ML evaluation systems, including designing evaluation benchmarks or quality assessment frameworks and evaluating sequential or multi-step AI outputs
- Experience independently defining evaluation architecture and methodology for AI or ML systems, with the ability to design evaluation approaches where the unit of analysis is a conversation or session rather than a single output
- Experience designing adversarial or red-teaming test methodologies for ML models or AI-powered features, including adversarial scenarios that target failures across multi-turn interactions
- Experience with Python and ML frameworks (PyTorch, TensorFlow, or equivalent) in production or near-production settings
- Track record of owning technical direction for evaluation efforts across multiple features or product areas

Preferred Qualifications

- Experience evaluating user-facing AI features in consumer applications, with an understanding of how technical metrics connect to user-perceived quality
- Familiarity with productivity software or creative tools, with the ability to assess output quality from a user workflow perspective
- Experience ensuring alignment between automated and human evaluation methods, including inter-annotator agreement analysis and bias detection
- Track record of designing evaluation systems that scale across multiple features or product areas without requiring bespoke solutions for each
- Experience evaluating different types of AI systems, including API-based and custom-trained models
- Demonstrated ability to communicate evaluation findings and readiness assessments to cross-functional partners
- Experience leveraging automation to scale evaluation data generation and analysis
- Experience building evaluation pipelines for conversational AI, dialogue systems, or agentic workflows, including turn-level and session-level automated scoring
- Familiarity with agent orchestration frameworks (LangChain, LangGraph, CrewAI, AutoGen) and observability tooling (LangSmith, Braintrust, Arize), with an understanding of how to instrument and evaluate multi-step agent runs
- Experience designing adversarial tests for tool-use reliability, function-calling accuracy, or agent planning quality

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role.
The base pay range for this role is between $139,500 and $258,100, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits.

Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.

