Data Architect, Data Foundry
$132k - $193.6k
Initial Therapeutics, Inc.
At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life‑changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

Position: Data Architect, Data Foundry
Location: San Diego, CA; San Francisco, CA; Boston, MA; Louisville, CO; Indianapolis, IN

Overview

Lilly Small Molecule Discovery is purpose‑built to create molecules that make life better for people. Discovery Technology and Platforms (DTP) accelerates molecule discovery by building optimized foundational platforms, streamlining lab operations through advanced technologies and data connectivity, and investing in novel capabilities.

Data Foundry is a multidisciplinary team within DTP that enables AI‑native drug discovery through four integrated pillars: Architecture4Insight (data infrastructure and scientific software), Methods4Insight (analytical and computational methods), Automation & Scale4Insight (lab automation and agentic workflows), and Preparedness4Insight (data governance and readiness). These pillars empower every Lilly scientist to make optimal decisions by providing seamless access to data, insights, and AI‑driven capabilities, serving both human scientists and autonomous AI agents.

Position Summary

We are seeking Data Architects at multiple levels to design and build the data infrastructure that makes AI‑native drug discovery possible. You will create the schemas, ontologies, data models, knowledge graphs, and platform architectures that transform raw scientific data into machine‑actionable, FAIR‑compliant, insight‑ready assets, serving both discovery scientists and autonomous AI agents.

This role is the foundation of Architecture4Insight. Everything the software engineering team builds (pipelines, APIs, prototypes) depends on the data models and platform architecture this team designs. You will work with deep knowledge of scientific data (chemical, biological, HTE, automation‑generated) to create custom‑fit solutions, then partner with View email address on click.appcast.io to scale and maintain them.

The role spans three focus areas depending on expertise: data modeling & ontologies, data platform & lakehouse architecture, and knowledge graph & specialized data systems.

Responsibilities

Data Modeling & Ontologies
- Design and implement data models, schemas, and ontologies for chemical, biological, and automation‑generated data that serve discovery workflows across the portfolio.
- Define and maintain controlled vocabularies, metadata standards, and FAIR‑compliant data frameworks in partnership with Preparedness4Insight.
- Implement semantic data standards (RDF, OWL, SPARQL) and ontology engineering practices to create interoperable, machine‑readable scientific data.

Data Platform & Lakehouse Architecture
- Design and implement data lakehouse architecture using modern platforms (Databricks, Snowflake, or equivalent), including data storage patterns, partitioning strategies, and query optimization.
- Build and optimize ETL/ELT pipelines using Spark, dbt, or similar tools to transform raw scientific data into analytical and ML‑ready formats.
- Implement real‑time and streaming data integration (Kafka, Kinesis, event‑driven patterns) connecting LIMS, instruments, and lab automation systems to the data infrastructure.

Knowledge Graph & Specialized Data Systems
- Design and implement knowledge graphs (Neo4j, Amazon Neptune, TigerGraph) that capture molecular, target, pathway, and experimental relationships across the discovery landscape.
- Architect specialized data solutions: array databases (TileDB) for genomics/imaging, document stores (MongoDB) for experimental records, and vector databases for embedding‑based retrieval supporting ML and RAG workflows.
- Build query and traversal patterns that enable scientists and AI agents to ask relational questions across the entire data landscape.

Cross‑Functional Partnership
- Partner with scientific software engineers to ensure data architectures are implementable, performant, and well‑documented.
- Collaborate with Methods4Insight to design data structures that support analytical model training, deployment, and evaluation.
- Work with View email address on click.appcast.io to define scaling strategies, ensure enterprise compliance, and transition data architectures to production‑grade management.
- Contribute to build‑versus‑buy‑versus‑adopt decisions by evaluating commercial and open‑source data platforms against Data Foundry requirements.

Basic Requirements
- B.S. or M.S. in Computer Science, Data Science, Bioinformatics, Computational Biology, Information Science, or related STEM field; Ph.D. valued for ontology and knowledge graph roles.
- B.S. with 7+ years and M.S. with 5+ years of data architecture, data engineering, or scientific informatics experience.
- SQL skills and experience in multiple database paradigms (relational, graph, document, columnar, key‑value).
- Qualified applicants must be authorized to work in the United States on a full‑time basis. Lilly will not provide support for or sponsor work authorization or visas for this role, including but not limited to F‑1 CPT, F‑1 OPT, F‑1 STEM OPT, J‑1, H‑1B, TN, O‑1, E‑3, H‑1B1, or L‑1.

Preferred Qualifications
- Expertise in at least one of: data modeling/ontologies, data platform engineering (Databricks, Snowflake, Spark), or graph/specialized databases (Neo4j, Neptune, MongoDB).
- Familiarity with cloud platforms (AWS, Azure, or GCP) and modern data integration patterns.
- Understanding of scientific data types and experimental workflows in life sciences or pharma (chemical, biological, HTE data).
- Strong communication skills with ability to translate data architecture concepts for both technical and scientific audiences.
- Pharmaceutical or biotech research industry experience, particularly in discovery data management or research informatics.
- Experience with semantic web technologies: RDF, OWL, SPARQL, Protégé, or equivalent ontology engineering tools.
- Hands‑on experience with graph databases (Neo4j, Neptune, TigerGraph) and knowledge graph design patterns for scientific data.
- Data lakehouse architecture experience: Databricks (Delta Lake, Unity Catalog), Snowflake, or equivalent; ETL/ELT with Spark, dbt.
- Experience with streaming/real‑time data platforms (Kafka, Kinesis, Flink) and event‑driven architectures.
- Familiarity with LIMS, ELN systems (e.g., Benchling), and laboratory instrument data integration.
- Experience with vector databases (Pinecone, Weaviate, pgvector) and embedding‑based retrieval for ML/RAG applications.
- Array database experience (TileDB, Zarr) for genomics, imaging, or high‑dimensional scientific data.
- Experience with bioinformatics data formats (FASTA, BAM/CRAM, VCF) and biological sequence databases; familiarity with NGS data pipelines and proteomics data management.
- FAIR data principles implementation experience and Data Readiness Level frameworks.
- Scientific data standards and controlled vocabularies in chemistry (InChI, SMILES) or biology (Gene Ontology, UniProt, pathway databases such as Reactome or KEGG).

Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.

Lilly is proud to be an EEO Employer and does not discriminate on the basis of age, race, color, religion, gender identity, sex, gender expression, sexual orientation, genetic information, ancestry, national origin, protected veteran status, disability, or any other legally protected status.

Our employee resource groups (ERGs) offer strong support networks for their members and are open to all employees. Our current groups include Africa, Middle East, Central Asia Network, Black Employees at Lilly, Chinese Culture Network, Japanese International Leadership Network (JILN), Lilly India Network, Organization of Latinx at Lilly (OLA), PRIDE (LGBTQ+ Allies), Veterans Leadership Network (VLN), Women’s Initiative for Leading at Lilly (WILL), enAble (for people with disabilities). Learn more about all of our groups.

Actual compensation will depend on a candidate’s education, experience, skills, and geographic location. The anticipated wage for this position is $132,000 - $193,600. Full‑time equivalent employees also will be eligible for a company bonus (depending, in part, on company and individual performance). In addition, Lilly offers a comprehensive benefit program to eligible employees, including eligibility to participate in a company‑sponsored 401(k); pension; vacation benefits; eligibility for medical, dental, vision and prescription drug benefits; flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts); life insurance and death benefits; certain time off and leave of absence benefits; and well‑being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities). Lilly reserves the right to amend, modify, or terminate its compensation and benefit programs in its sole discretion and Lilly’s compensation practices and guidelines will apply regarding the details of any promotion or transfer of Lilly employees.

#WeAreLilly

