Annotation Data Scientist, Evaluation Integrity (Siri)
Apple Oakbrook
Weekly Hours: 40
Role Number: 200664186-1242
Summary
Play a part in the ongoing revolution in human-computer interaction. Siri is evolving — and the way we evaluate it has to evolve with it. Join the Evaluation Integrity team to help build the trusted quality signal behind every Siri release. Within the Siri evaluation organization, the Human Evaluation sub-team is responsible for answering the question: can we trust our evals? We do that by designing human-in-the-loop (HITL) annotation tasks that scrutinize every moving part of an agentic evaluation — the simulated user agent, the conversation it has with Siri, and the automated evaluators that grade the exchange. This role sits at the intersection of data science, human annotation engineering, and evaluation methodology, and is instrumental in turning human judgment into a rigorous, reproducible signal that directly informs pre-ship model and product decisions.
Description
As an Annotation Data Scientist on the Evaluation Integrity team, you will design and run HITL annotation projects that evaluate the quality and authenticity of agentic user personae, the validity of agent-to-agent conversations, and the reliability of LLM-as-judge and rule-based evaluators against Siri's product specifications. You will own annotation initiatives end to end, from rubric design and tooling, through annotator calibration, to data science analysis that turns annotator judgments into actionable signal for modeling, planning, and product teams.
Minimum Qualifications
Bachelor's or Master's degree in a quantitative or related field such as Data Science, Computer Science, Linguistics, Statistics, or Cognitive Science, or equivalent job-related experience.
5+ years of hands-on experience working with human-annotated datasets or human-in-the-loop evaluation methodologies for machine learning, natural language processing, or large language model systems.
5+ years of experience using Python for data processing, analysis, and prototyping, including experience with tools such as pandas and Jupyter, and with at least one data visualization library.
Experience designing, implementing, and communicating annotation schemas, rubrics, or ontologies for machine learning training or evaluation data.
Experience managing multiple concurrent dataset curation efforts, including scoping work, iterating on guidelines, coordinating with in-house or vendor annotators, and monitoring annotator performance metrics such as accuracy, throughput, and inter-annotator agreement.
Experience specifying or designing custom annotation tooling in collaboration with software engineers.
Preferred Qualifications
Experience evaluating LLM-powered or agentic systems, including familiarity with LLM-as-judge methodologies, rubric-based grading, or trajectory and tool-call evaluation.
Familiarity with statistical methods that address accuracy and variability in human annotation data, such as inter-annotator agreement measures (Cohen's or Fleiss' kappa, Krippendorff's alpha) or bootstrapping.
Data-querying experience with SQL, Spark, or similar, and comfort working with large, complex, real-world datasets.
Experience building pre-ship evaluation pipelines for conversational or assistant products.
Experience with prompt engineering, or with designing simulated user personae for agent evaluation.
Experience running annotation programs across multiple locales or at large scale.
Excellent written and verbal communication skills, with the ability to explain technical topics clearly to data scientists, engineers, annotators, and cross-functional partners.
Proven ability to collaborate effectively across functions and drive projects of varying sizes and scopes — knowing when to dive deep and when to delegate.


