ML Engineer, Proactive - Agentic Systems Evaluation

$126.8k - $220.9k

Apple Oakbrook

Are you passionate about working on the next generation of personalized intelligence systems? In this role, you will develop and deploy robust evaluation frameworks across the data lifecycle -- from data collection and processing to analytic dashboards for reporting. You will be part of the larger Proactive Intelligence team, which builds features that anticipate customers' needs and create personalized experiences by adapting to user behaviors with machine learning running locally on-device or in PCC. Join our cross-functional team of specialists dedicated to the evaluation of agentic systems.

Description

We are looking for a high-impact ML Evaluation Engineer to help architect rigorous evaluation systems for autonomous agents. With the rise of generative AI, the ability to quantify the reliability and quality of these systems is more critical than ever. You will design and deploy qualitative and quantitative metrics to measure the quality, reasoning, and tool-use accuracy of agentic systems. You will be working with very sensitive data, so leveraging existing privacy-enhancing technologies and developing new ones -- such as differential privacy, PII redaction, and data minimization -- will be crucial. The team you will be joining is focused on advancing scalable automated processes for evaluation. To succeed, you will need a deep understanding of system-level software operations to deliver next-generation capabilities. Join the Proactive Intelligence team to build the evaluation platforms for the future of intelligent, personalized experiences.

Responsibilities
  • Design and implement evaluation frameworks to measure quality, reasoning, and tool-use accuracy of agentic systems.
  • Develop MCP servers and API orchestration layers to enable reliable tool-use for agentic systems.
  • Orchestrate end-to-end ML workflows by integrating heterogeneous internal systems (spanning data services, compute infrastructure, model deployment, and results visualization) into cohesive, production-ready pipelines.
  • Create and manage analytic dashboards to surface evaluation insights to key stakeholders.
  • Collaborate cross-functionally with teams across ML and SWE.

Minimum Qualifications
  • MS or PhD in Computer Science, Machine Learning, Statistics, or equivalent practical experience in a quantitative field.
  • 3+ years of industry experience in ML Engineering or Applied Science.
  • Strong software engineering fundamentals (Python is a must) with experience building scalable, automated data or evaluation pipelines.

Preferred Qualifications
  • Demonstrated experience applying Differential Privacy, Federated Learning, or advanced PII redaction techniques to large-scale datasets.
  • Hands-on experience building or testing LLM-based systems, including a deep understanding of chain-of-thought reasoning, prompt engineering, and agentic planning.
  • Proficiency in building or evaluating systems that integrate with external tools/APIs.
  • Experience with specialized agent evaluation frameworks and analyzing execution traces to identify failure modes in multi-turn interactions.
  • Experience with compiled languages (e.g., Swift) and a curiosity about how ML interacts with OS-level software operations.
  • A track record of developing custom metrics (e.g., "LLM-as-a-Judge") or publishing research on model reliability, safety, or algorithmic bias.

Pay & Benefits

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $126,800 and $220,900, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple's discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple's Employee Stock Purchase Plan. You'll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits.

Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.

At Apple, we believe accessibility is a fundamental human right. You'll find that idea reflected in everything here: in our culture, our benefits and our digital tools. By welcoming as many perspectives as possible, we help you build a career where you feel like you belong. Learn about accessibility in Apple's workplace. Learn about reasonable accommodations for job applicants.

Apple accepts applications to this posting on an ongoing basis.

Vacancy posted 2 days ago