Machine Learning Scientist, Agentic Data Pipelines
Iambic
Scientist Position at Iambic Therapeutics
We are seeking a scientist to join our team at Iambic Therapeutics, working on data acquisition and curation for Enchant, our multimodal transformer model trained at scale on a wide variety of biomedical data. In this role, you will design and build agentic systems that acquire, clean, format, and quality-control the large-scale datasets that power Enchant training. You will work at the intersection of LLM-based automation and biomedical data engineering—developing AI agents that can navigate heterogeneous data sources, enforce quality standards, and operate reliably at scale.
This role is ideal for candidates who combine strong software engineering instincts with scientific understanding of biomedical data, and who are excited about using LLMs as tools to solve practical data problems.
Key Responsibilities
- Design, build, and maintain agentic systems for automated data acquisition from public and proprietary biomedical data sources
- Develop LLM-based pipelines for data cleaning, normalization, and formatting across diverse data modalities (e.g., molecular, genomic, clinical, literature)
- Implement automated quality-control workflows that detect anomalies, flag inconsistencies, and enforce data standards
- Evaluate and iterate on agent architectures, prompting strategies, and tool-use patterns to improve reliability and throughput
- Collaborate with ML scientists on the Enchant team to understand data requirements and translate them into scalable acquisition and processing systems
- Monitor and maintain data pipelines in production, diagnosing failures and improving robustness over time
- Document data provenance, processing decisions, and quality metrics to support reproducibility and auditing
Qualifications
Required:
- Master's or PhD in a computational STEM field, or equivalent industry experience
- Strong Python engineering skills, including experience building and maintaining production-quality software
- Hands-on experience with LLM APIs (e.g., Claude, GPT) and agentic patterns such as tool use, orchestration, and multi-step reasoning
- Familiarity with biomedical or chemical data sources and formats (e.g., PDB, UniProt, ChEMBL, SDF/MOL, FASTA, or similar)
- Comfort with data engineering fundamentals: ETL design, data validation, and working with structured and unstructured data at scale
Desired:
- Experience with agent orchestration frameworks
- Familiarity with cloud infrastructure and workflow orchestration (e.g., AWS, Docker, Kubernetes)
- Knowledge of multimodal biomedical data—spanning small molecules, proteins, assays, images, 'omics, and/or clinical records
- Experience with large-scale dataset construction or curation for ML model training
Location: Remote (US or UK). On-site available in Bristol, UK and Boston, US.
About Iambic Therapeutics
Iambic is a clinical-stage life-science and technology company developing novel medicines using its AI-driven discovery and development platform. Based in San Diego and founded in 2020, Iambic has assembled a world-class team that unites pioneering AI experts and experienced drug hunters. The Iambic platform has demonstrated delivery of new drug candidates to human clinical trials with unprecedented speed and across multiple target classes and mechanisms of action. Iambic is advancing a pipeline of potential best-in-class and first-in-class clinical assets, both internally and in partnership, to address urgent unmet patient need.
Mission & Core Values
Our mission is to deliver better medicines through innovations in AI-based discovery technologies. The culture and work at Iambic Therapeutics are profoundly strengthened by the diversity of our people and our differences in background, culture, national origin, religion, sexual orientation, and life experiences. We are committed to building an inclusive environment where a diverse group of talented humans work together to discover therapeutics and create technologies.
Pay and Benefits
We offer industry-leading, competitive pay, company-paid healthcare, flexible spending accounts, voluntary life insurance, 401(k) matching, and uncapped vacation to our team. We are in a brand-new, state-of-the-art facility in beautiful San Diego with an onsite gym, dining, and easy access to great places to live and play.

