ML Engineer, Proactive - Agentic Systems Evaluation
$126.8k - $220.9k · Apple Oakbrook
Are you passionate about working on the next generation of personalized intelligence systems? In this role, you will develop and deploy robust evaluation frameworks across the data lifecycle, from data collection and processing to analytic dashboards for reporting. You will be part of the larger Proactive Intelligence team, which builds features that anticipate customers' needs and create personalized experiences by adapting to user behaviors with machine learning running locally on-device or in PCC. Join our cross-functional team of specialists dedicated to the evaluation of agentic systems.
Description
We are looking for a high-impact ML Evaluation Engineer to help architect rigorous evaluation systems for autonomous agents. With the rise of generative AI, the ability to quantify the reliability and quality of these systems is more critical than ever. You will design and deploy qualitative and quantitative metrics to measure the quality, reasoning, and tool-use accuracy of agentic systems. You will be working with very sensitive data, so leveraging existing privacy-enhancing technologies and developing new ones -- such as differential privacy, PII redaction, and data minimization -- will be crucial. The team you will be joining is focused on advancing scalable automated processes for evaluation. To succeed, you will need a deep understanding of system-level software operations to deliver next-generation capabilities. Join the Proactive Intelligence team to build the evaluation platforms for the future of intelligent, personalized experiences.
Responsibilities
- Design and implement evaluation frameworks to measure quality, reasoning, and tool-use accuracy of agentic systems.
- Develop MCP servers and API orchestration layers to enable reliable tool-use for agentic systems.
- Orchestrate end-to-end ML workflows by integrating heterogeneous internal systems (data services, compute infrastructure, model deployment, and results visualization) into cohesive, production-ready pipelines.
- Create and manage analytic dashboards to surface evaluation insights to key stakeholders.
- Collaborate cross-functionally with teams across ML and SWE.
Minimum Qualifications
- MS or PhD in Computer Science, Machine Learning, Statistics, or equivalent practical experience in a quantitative field.
- 3+ years of industry experience in ML Engineering or Applied Science.
- Strong software engineering fundamentals (Python is a must) with experience building scalable, automated data or evaluation pipelines.
Preferred Qualifications
- Demonstrated experience applying Differential Privacy, Federated Learning, or advanced PII redaction techniques to large-scale datasets.
- Hands-on experience building or testing LLM-based systems, including a deep understanding of chain-of-thought reasoning, prompt engineering, and agentic planning.
- Proficiency in building or evaluating systems that integrate with external tools/APIs.
- Experience with specialized agent evaluation frameworks and analyzing execution traces to identify failure modes in multi-turn interactions.
- Experience with compiled languages (e.g., Swift) and a curiosity about how ML interacts with OS-level software operations.
- A track record of developing custom metrics (e.g., "LLM-as-a-Judge") or publishing research on model reliability, safety, or algorithmic bias.
Pay & Benefits
At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $126,800 and $220,900, and your base pay will depend on your skills, qualifications, experience, and location.
Apple employees also have the opportunity to become an Apple shareholder through participation in Apple's discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple's Employee Stock Purchase Plan. You'll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation.
Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics.
At Apple, we believe accessibility is a fundamental human right. You'll find that idea reflected in everything here: in our culture, our benefits and our digital tools. By welcoming as many perspectives as possible, we help you build a career where you feel like you belong.
Apple accepts applications to this posting on an ongoing basis.

