Machine Learning Scientist, Agentic Data Pipelines

Iambic

Scientist Position at Iambic Therapeutics

We are seeking a scientist to join our team at Iambic Therapeutics, working on data acquisition and curation for Enchant, our multimodal transformer model trained at scale on a wide variety of biomedical data. In this role, you will design and build agentic systems that acquire, clean, format, and quality-control the large-scale datasets that power Enchant training. You will work at the intersection of LLM-based automation and biomedical data engineering—developing AI agents that can navigate heterogeneous data sources, enforce quality standards, and operate reliably at scale.

This role is ideal for candidates who combine strong software engineering instincts with scientific understanding of biomedical data, and who are excited about using LLMs as tools to solve practical data problems.

Key Responsibilities
  • Design, build, and maintain agentic systems for automated data acquisition from public and proprietary biomedical data sources
  • Develop LLM-based pipelines for data cleaning, normalization, and formatting across diverse data modalities (e.g., molecular, genomic, clinical, literature)
  • Implement automated quality-control workflows that detect anomalies, flag inconsistencies, and enforce data standards
  • Evaluate and iterate on agent architectures, prompting strategies, and tool-use patterns to improve reliability and throughput
  • Collaborate with ML scientists on the Enchant team to understand data requirements and translate them into scalable acquisition and processing systems
  • Monitor and maintain data pipelines in production, diagnosing failures and improving robustness over time
  • Document data provenance, processing decisions, and quality metrics to support reproducibility and auditing

Qualifications

Required:

  • Master's or PhD in a computational STEM field, or equivalent industry experience
  • Strong Python engineering skills, including experience building and maintaining production-quality software
  • Hands-on experience with LLM APIs (e.g., Claude, GPT) and agentic patterns such as tool use, orchestration, and multi-step reasoning
  • Familiarity with biomedical or chemical data sources and formats (e.g., PDB, UniProt, ChEMBL, SDF/MOL, FASTA, or similar)
  • Comfort with data engineering fundamentals: ETL design, data validation, and working with structured and unstructured data at scale

Desired:

  • Experience with agent orchestration frameworks
  • Familiarity with cloud infrastructure and workflow orchestration (e.g., AWS, Docker, Kubernetes)
  • Knowledge of multimodal biomedical data—spanning small molecules, proteins, assays, images, 'omics, and/or clinical records
  • Experience with large-scale dataset construction or curation for ML model training

Location: Remote (US or UK). On-site available in Bristol, UK and Boston, US.

About Iambic Therapeutics

Iambic is a clinical-stage life-science and technology company developing novel medicines using its AI-driven discovery and development platform. Based in San Diego and founded in 2020, Iambic has assembled a world-class team that unites pioneering AI experts and experienced drug hunters. The Iambic platform has demonstrated delivery of new drug candidates to human clinical trials with unprecedented speed and across multiple target classes and mechanisms of action. Iambic is advancing a pipeline of potential best-in-class and first-in-class clinical assets, both internally and in partnership, to address urgent unmet patient need.

Mission & Core Values

Our mission is to deliver better medicines through innovations in AI-based discovery technologies. The culture and work at Iambic Therapeutics are profoundly strengthened by the diversity of our people and our differences in background, culture, national origin, religion, sexual orientation, and life experiences. We are committed to building an inclusive environment where a diverse group of talented humans work together to discover therapeutics and create technologies.

Pay and Benefits

We offer industry-leading competitive pay, company-paid healthcare, flexible spending accounts, voluntary life insurance, 401(k) matching, and uncapped vacation to our team. We are in a brand-new, state-of-the-art facility in beautiful San Diego with an onsite gym, dining, and easy access to great places to live and play.

Vacancy posted 12 hours ago