Annotation Data Scientist, Evaluation Integrity (Siri)

Apple Oakbrook

Weekly Hours: 40

Role Number: 200664186-1242

Summary

Play a part in the ongoing revolution in human-computer interaction. Siri is evolving — and the way we evaluate it has to evolve with it. Join the Evaluation Integrity team to help build the trusted quality signal behind every Siri release. Within the Siri evaluation organization, the Human Evaluation sub-team is responsible for answering the question: can we trust our evals? We do that by designing human-in-the-loop (HITL) annotation tasks that scrutinize every moving part of an agentic evaluation — the simulated user agent, the conversation it has with Siri, and the automated evaluators that grade the exchange. This role sits at the intersection of data science, human annotation engineering, and evaluation methodology, and is instrumental in turning human judgment into a rigorous, reproducible signal that directly informs pre-ship model and product decisions.

Description

As an Annotation Data Scientist on the Evaluation Integrity team, you will design and run HITL annotation projects that evaluate the quality and authenticity of agentic user personae, the validity of agent-to-agent conversations, and the reliability of LLM-as-judge and rule-based evaluators against Siri's product specifications. You will own annotation initiatives end-to-end: from rubric design and tooling, through annotator calibration, to data science analysis that turns annotator judgments into actionable signal for modeling, planning, and product teams.

Minimum Qualifications

  • Bachelor's or Master's degree in a quantitative or related field such as Data Science, Computer Science, Linguistics, Statistics, or Cognitive Science, or equivalent job-related experience.

  • 5+ years of hands-on experience working with human-annotated datasets or human-in-the-loop evaluation methodologies for machine learning, natural language processing, or large language model systems.

  • 5+ years of experience using Python for data processing, analysis, and prototyping, including experience with libraries such as pandas, Jupyter, and at least one data visualization library.

  • Experience designing, implementing, and communicating annotation schemas, rubrics, or ontologies for machine learning training or evaluation data.

  • Experience managing multiple concurrent dataset curation efforts, including scoping work, iterating on guidelines, coordinating with in-house or vendor annotators, and monitoring annotator performance metrics such as accuracy, throughput, and inter-annotator agreement.

  • Experience specifying or designing custom annotation tooling in collaboration with software engineers.

Preferred Qualifications

  • Experience evaluating LLM-powered or agentic systems, including familiarity with LLM-as-judge methodologies, rubric-based grading, or trajectory and tool-call evaluation.

  • Familiarity with statistical methods that address accuracy and variability in human annotation data, such as inter-annotator agreement, Cohen's or Fleiss' kappa, Krippendorff's alpha, or bootstrapping.

  • Data-querying experience with SQL, Spark, or similar, and comfort working with large, complex, real-world datasets.

  • Experience building pre-ship evaluation pipelines for conversational or assistant products.

  • Experience with prompt engineering, or with designing simulated user personae for agent evaluation.

  • Experience running annotation programs across multiple locales or at large scale.

  • Excellent written and verbal communication skills, with the ability to explain technical topics clearly to data scientists, engineers, annotators, and cross-functional partners.

  • Proven ability to collaborate effectively across functions and drive projects of varying sizes and scopes — knowing when to dive deep and when to delegate.

Vacancy posted 2 days ago