Data Scientist

University of California, San Francisco

Job Function Summary:

Involves developing and utilizing computational tools and systems to analyze and interpret biological or other research data. Utilizes and develops algorithms, computational techniques, and standard statistical methodologies. Helps in the design of new experiments and leads the execution of building machine learning and statistical models. Implements end-user needs in database development, maintenance, searching, and integration. Maintains computational infrastructure and manages and tracks the flow of samples and information for large-scale studies. Provides bioinformatics and access to public and proprietary databases. Manages cloud and on-premises computational infrastructure and data.

Generic Scope

Professional who applies acquired job skills, policies, and procedures to complete substantive assignments / projects / tasks of moderate scope and complexity; exercises judgment within defined guidelines and practices to determine appropriate action.

Custom Scope

Our research efforts are at the intersection of cardiovascular disease and human genetics. Our clinical research efforts employ new techniques for deep phenotyping, such as deep learning, but these techniques rely on a solid foundation of classical bioinformatics. The Bioinformatics Programmer/Data Scientist will assist in managing, cleaning, and analyzing large-scale medical data using a wide variety of analytic techniques, both in the cloud and with on-premises compute, depending on data permissions. Experience with a cloud provider such as AWS, Microsoft Azure, or Google Cloud is a plus; the ability to learn how to manage cloud-based pipelines and to perform cloud data management will be essential skills to develop and maintain. Core responsibilities will include maintaining bioinformatic databases by obtaining and restructuring data, including both UCSF proprietary data and public data, and writing tools to streamline discovery and replication analyses using these databases. An important task will be writing and maintaining analytic pipelines in languages such as R, Python, Go, Rust, shell, SQL, WDL, and/or other appropriate languages, and using tools such as Docker. Experience with databases, or the ability to learn them, will be requisite. Under the supervision of the PI, the Data Scientist will also be involved in data analysis and will be comfortable with bioinformatic analyses, including variant calling and annotation. There will be opportunities to employ cutting-edge methods and to develop new methods. The ability to learn and implement new techniques depending on the problem at hand will be an essential skill, and requires a strong foundation in computer programming. This position will also include administrative duties and will offer the opportunity to participate in, and to lead, authorship teams.

Key Responsibilities

Each responsibility is listed with its % of time and whether it is an Essential Function (Yes/No).

30% of time (Essential Function: YES)

Designs, develops, debugs and utilizes computer programs necessary to extract, transform, and load data and prepare it for analysis.

  • Assists in extracting, transforming, and loading data from clinical sources and research sources using a wide variety of analytic techniques.
  • Develops data pipelines to standardize and automate repeatable data processing steps as appropriate.
  • Builds and runs programs to extract relevant imaging, biosignals, and medical data from clinical systems, including UCSF data.
  • Performs data quality control.

25% of time (Essential Function: YES)

Utilizes standard software tools to analyze, interpret or create moderately complex biological or research data.

  • Uses software such as plink2 to manage, merge, split, and analyze sequencing and genetic imputation data
  • Performs quality control at the sample-, variant-, and genotype-level for genetic sequencing and imputation data
  • Conducts analyses with linear, logistic, or survival models where appropriate

15% of time (Essential Function: YES)

Assists with computational resource management.

  • Assists with management of research databases and shared computational resources
  • Manages cloud virtual environments
  • Manages containerization with tools such as Docker

15% of time (Essential Function: YES)

Assists with report preparation and/or analysis for internal constituents and for scientific publication and dissemination.

  • Describes methods, results and implications of the work
  • Conducts background bibliographic research and summaries of the latter if appropriate for documents to be published externally
  • Generates appropriate data visualizations
  • Assists with general manuscript preparation and submission

15% of time (Essential Function: YES)

Maintains code and documentation, and communicates proactively.

  • Writes internal-facing documentation for all analyses, coding, tooling, and pipelines, clearly describing in text and graphics what is done and why it is done this way.
  • Writes appropriate code comments explaining unintuitive decisions, algorithms, and functions to allow other lab members to reason clearly about the code.
  • Uses change-management software, including git for code management.
  • Proactively communicates to the PI about barriers to progress and possible code or workflow improvements.
  • Provides the PI and collaborators with recommendations and guidance for subsequent steps.

100%

Required Qualifications

  • Bachelor's degree in biological science, computational/programming, or related area, and/or equivalent experience/training.
  • 12 months or more of demonstrated work experience using medical and/or health-related data, or similar, including developing pipelines for extracting, transforming, and loading data, and data analysis.
  • Working knowledge of bioinformatics methods and data structures.
  • Working knowledge of biostatistics and basic statistical testing.
  • Working knowledge of systems programming and databases.
  • Working knowledge of application and data security concepts.
  • Ability to effectively manage time and see assigned parts of projects through to completion on deadline.
  • Basic consultation and communication skills.
  • Demonstrated fluency and competency with statistical programming with the R programming language or the Python programming language.
  • Experience with or a demonstrated ability to learn and implement data management and computational pipelines for management of large-scale data.
  • At least 6 months of experience in direct management and analysis of medical and/or health-related data using the above tools.
  • Ability to lead and maintain data pipelines for real-time data acquisition from clinical systems.
  • Ability to multi-task and work well with limited supervision.
  • Working project management skills.
  • Interpersonal skills in order to work with both technical and non-technical personnel at various levels in the organization.
  • Ability to communicate technical information in a clear and concise manner.
  • Self-motivated, able to learn quickly, meet deadlines, and demonstrate problem-solving skills.

Preferred Qualifications

  • MS or greater in a related science or an equivalent combination of education and experience.
  • PhD in a field relevant to biomedical research (bioinformatics, biomedical engineering), or computer science (computer science, machine learning, artificial intelligence) or similar.

Vacancy posted 4 days ago