RG 5363: KI-FOR Fusing Deep Learning and Statistics towards Understanding Structured Biomedical Data

At a glance

Project duration
01/2023 – 01/2027
DFG classification of subject areas

Electrical Engineering and Information Technology

Medicine

Social Sciences

Funded by

DFG Research Unit

Project description

High-throughput measurements in the biomedical sciences, such as stacks of images, genome sequences or time series, constitute structured data that are characterized by inherent dependencies between measurements, an often non-vectorial nature, and the presence of confounding influences and sampling biases. For example, population structure, systematic measurement artifacts, non-independent sampling or differing age distributions between groups can lead to spurious results if not accounted for. Deep learning excels in many applications on structured data owing to its ability to capture complex dependencies within and between inputs and outputs, allowing for accurate prediction. Despite recent advances in explainable artificial intelligence and Bayesian neural networks, deep learning still has limitations with respect to its assessment of uncertainty, interpretability, and validation. These, however, are essential for going beyond prediction towards understanding the underlying biology.

To this end, statistics has traditionally been used in the biomedical sciences because of its interpretable model output and statistical inference, which provides, among other things, quantification of uncertainty, corrections for confounding and testing of hypotheses with statistical error control. Methods from classical statistics, however, have limitations in their modelling flexibility for structured data and in their ability to capture complex non-linearities in a data-driven way.

In this research unit we bring together experts from machine learning and statistics with a track record in biomedical applications to address the following overarching objectives:

(O1) to integrate deep learning and statistics to improve interpretability, uncertainty quantification and statistical inference for deep learning, and to improve the modelling flexibility of statistical methods for structured data. In particular, we will develop methods that provide statistical inference for structured data by quantification of uncertainty, testing of hypotheses and conditioning on confounders, and that improve explanations of structured data through hybrid statistical and deep learning models, population- and distribution-level explanations, and robust sparse explanations.

(O2) to create a feedback loop between this methods development and biomedical applications, where we account for the needs in the analysis of the data when developing new methods and generate biomedical insights from applications of the developed methods to the data. Applications include analysis of MRI, fMRI and microscopy images, longitudinal disease progression modelling, DNA sequence analysis, and genetic association studies.

Topics

Medicine

Cooperation partners

  • Cooperation partner
    University, Germany

    Eberhard Karls University of Tübingen

  • Cooperation partner
    Research institute, Germany

    Hasso Plattner Institute for Digital Engineering

  • Cooperation partner
    University, Germany

    Karlsruhe Institute of Technology

  • Cooperation partner
    Non-university research institution, Germany

    Max Delbrück Center for Molecular Medicine in the Helmholtz Association

  • Cooperation partner
    University, Germany

    Technical University of Berlin

  • Cooperation partner
    University, Germany

    University Hospital and Faculty of Medicine Tübingen