Humboldt-Universität zu Berlin - Mathematisch-Naturwissenschaftliche Fakultät - Wissensmanagement in der Bioinformatik

Current Projects

  • TABSIM: Table Similarity Search, DFG, 2018-2020
  • PREDICT: comPREhensive Data Integration for Cancer Treatment, BMBF, 2016-2019
  • PerSonS: Personalizing Oncology via Semantic Integration of Data, BMBF, 2016-2019
  • simpatix: Similarity Search for Richly Annotated Structured Patient Cases, DFG, 2016 - 2019
  • BioPatent: and of Biomedical Patents, BMWi, 2015-2017
  • MAPTor-NET: MAPK-mTOR network model driven individualized therapies of pancreatic neuro-endocrine tumors (pNETs), BMBF, 2015-2018
  • Graduate School SOAMED: Service-Oriented Architectures in Medicine, DFG, 2010-2019
  • Excellence Graduate School BSIO: Berlin School of Integrative Oncology, DFG, 2012-2017
  • Research unit Stratosphere: Cloud-enabled declarative information management, DFG, 2010-2016




TABSIM: Table Similarity Search


Homepage: None yet

Funding: DFG

Period: 2018 - 2020

Project partner: HPI Potsdam

Existing table similarity measures (TSM) build on simple models of table metadata, structure, and content. They are designed mainly for tables with a horizontal layout, where each column represents one attribute and data values are in rows, and they cannot easily be applied to tables with other structures, such as matrix tables, where both rows and columns are represented by attributes and values. Moreover, they rely (in different ways) on frequency values of individual words, which is not sufficient to capture the semantics of table elements. The main objective of this project is to research methods that bring more "semantics" to table similarity measures. We expect that better TSM will significantly improve the quality of applications relying on tables, such as table similarity search and table auto-completion. We will approach this problem in two ways: by learning specific word embeddings optimized to yield semantically meaningful comparisons of single tokens within tables, and by designing a neural network architecture that addresses table normalization and table comparison in a single, trainable framework.
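To illustrate why word frequencies alone can miss the semantics of table elements, the following minimal sketch compares two table headers that share no tokens. It is not the project's method: the two-dimensional vectors in `TOY_EMBEDDINGS` are hand-picked, hypothetical values standing in for learned word embeddings, and averaging token vectors is only one simple way to compare token sets.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def frequency_similarity(tokens_a, tokens_b):
    """Cosine similarity of bag-of-words count vectors."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    vocab = sorted(set(counts_a) | set(counts_b))
    return cosine([counts_a[t] for t in vocab], [counts_b[t] for t in vocab])


# Hypothetical, hand-picked vectors; real embeddings would be learned from data.
TOY_EMBEDDINGS = {
    "city": (0.9, 0.1),
    "town": (0.8, 0.2),
    "population": (0.1, 0.9),
    "inhabitants": (0.2, 0.8),
}


def embedding_similarity(tokens_a, tokens_b, embeddings):
    """Cosine similarity of the averaged embeddings of two token lists."""

    def mean_vector(tokens):
        vectors = [embeddings[t] for t in tokens if t in embeddings]
        if not vectors:
            return None
        dim = len(vectors[0])
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

    mean_a, mean_b = mean_vector(tokens_a), mean_vector(tokens_b)
    if mean_a is None or mean_b is None:
        return 0.0
    return cosine(mean_a, mean_b)


header_a = ["city", "population"]
header_b = ["town", "inhabitants"]

freq_score = frequency_similarity(header_a, header_b)
emb_score = embedding_similarity(header_a, header_b, TOY_EMBEDDINGS)
print(f"frequency-based: {freq_score:.2f}, embedding-based: {emb_score:.2f}")
```

Because the two headers have no token in common, the frequency-based score is zero, while the embedding-based score is high under the assumed vectors.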

PREDICT: comPREhensive Data Integration for Cancer Treatment



Funding: BMBF

Period: 2016 - 2019

Project partner: Charite Berlin, Berlin Institute of Health

The central aim of the PREDICT project is to develop a software system that enables clinicians to use the large body of data on the relationships between genetic/epigenetic alterations and treatment options/success in cancer, in order to (a) support the rapid development of new, targeted studies whose design is essentially based on genomic features, and (b) enable a maximally informed and structured clinical decision process. A knowledge base will be created using advanced and innovative algorithms for knowledge extraction, semantic data integration, and biomedical text mining, and made available to the clinical oncologist through a cancer-genomic clinical workbench. Moreover, the knowledge base will be an essential tool to initiate and support highly targeted umbrella and basket trials, in which experimental drugs are administered to a typically small group of patients chosen based on their mutation status. Finally, the knowledge base will be used to develop novel algorithms to assess the effect of drugs on a patient’s tumor depending on its mutation profile.

PerSonS: Personalizing Oncology via Semantic Integration of Data



Funding: BMBF

Period: 2016 - 2019

Project partner: Universität Tübingen, KAIROS GmbH, Leibniz-Institut für Wissensmedien, Tübingen, Universitätsklinik Tübingen, Charite Berlin

The central goal of the project is to develop a software system that provides homogeneous and intuitive access to all data relevant to therapeutic decisions. The project will enable clinicians to (a) select stratified patient cohorts based on a full semantic integration of clinical and high-throughput (HT) data and (b) bring personalized tumor therapy to a new level through a touch-based visual analytics tool allowing intuitive access to all data – literally ‘information at your fingertips’. Key tasks to achieve these goals are the conversion of free-text clinical reports into structured data, automated and consistent analysis of HT data, a full semantic integration of the clinical data with selected HT data, the design of a user-friendly graphical user interface for the selection of patient cohorts, and finally (particularly for the translational phase) an interactive visual analytics tool based on a multi-representational interactive touch table for conducting personalized interdisciplinary tumor conferences in support of therapeutic decisions.

simpatix: Similarity Search for Richly Annotated Structured Patient Cases


Homepage: simpatix

Funding: DFG, to Dr. Starlinger

Period: 2016 - 2019

Project partner: Charite Berlin

Process-enhanced similarity search has great potential to improve knowledge discovery and decision support in a number of disciplines, especially in clinical medicine. Currently, this potential cannot be exploited because algorithms are lacking both for creating annotated process representations from unstructured content and for effectively comparing such annotated processes. In the simpatix project, we focus on the medical domain, where the central concept is a patient’s case, recorded in an (electronic) health record (EHR). Consisting of mostly unstructured or semi-structured data, such as clinical notes from examinations and treatments, tabularized data from quantitative tests (such as blood screenings), or discharge summaries, each case encodes a process describing the individual patient’s disease history. This project's main objectives are to a) develop methods for the construction of structured, process-oriented case representations from large data sets including unstructured documents; b) research algorithms for process-enhanced similarity search over richly annotated case collections; and c) design and implement a generic repository to store process-enhanced case collections that allows scalable, effective similarity search.

MAPTor-NET: MAPK-mTOR network model driven individualized therapies of pancreatic neuro-endocrine tumors



Funding: BMBF

Period: 2015 - 2018

Project partner: Charite Berlin, Universität Oldenburg

Pancreatic NET (pNET) comprise the most prominent subgroup of rare neuroendocrine tumors (NET), with distinct prognostic classes and thus diverse therapeutic regimens. Available pNET treatments include somatostatin analogs, systemic chemotherapy, and novel molecular drugs targeting receptor tyrosine kinases (Sunitinib) or the mTOR pathway (Everolimus). However, tumor heterogeneity results in an unpredictable response to therapy, and only a limited number of patients profit from either treatment. To date, no method for diagnostic stratification of patients exists. The MAPTor-NET consortium proposes a focused systems medicine approach that uses clinical and pathological data together with mutation/expression profiles to individually preselect patients prior to therapy. The approach combines top-down modeling of the core pathways altered in pNET with a bottom-up approach that gathers and integrates individual molecular data.

Graduate School GRK 1651 SOAMED - Service-Oriented Architectures in Medicine



Funding: Deutsche Forschungsgemeinschaft

Period: 2010 - 2019 (now in its second funding period)

Project partner: Technische Universität Berlin, Hasso-Plattner-Institut Potsdam, Fraunhofer FIRST Berlin

Service orientation is a promising architectural concept to quickly and cost-efficiently couple encapsulated software components ("services"), and to adapt them to new requirements. At the same time, Informatics is a key technology for the innovative organization of health care systems and of medical technology. In comparison with other organizational and embedded systems, the involved processes are more versatile, and the reliability and correctness requirements are higher. Medical processes are usually loosely coupled. Their integration is as difficult as it is important. Theoretical and methodological foundations of both the design process and the structure of service-oriented systems might substantially improve today’s information technology. In this situation, this Graduate School starts out with the idea to underpin the currently pragmatically focussed service-oriented approach with theoretical foundations by integrating established as well as emerging software engineering procedures. This approach aims at a decisive improvement of concepts, methods, and tool support for service-oriented system construction.

BSIO: Berlin School of Integrative Oncology



Funding: DFG Excellence Initiative

Period: 2012 - 2017

Project partner: Charite Berlin + Further partners

The BSIO offers a structured 3-year doctoral program that jointly educates natural scientists and physicians/medical students in the field of integrative oncology. It features excellent research conditions, a comprehensive curriculum, and a broad supervision and mentoring network. With respect to its scope, the BSIO aims to bring our understanding of malignant growth to new conceptual levels, i.e. to expand the molecular, cell biological, organismic and system-mathematical research focus by utilizing advanced experimental and simulation models to develop novel diagnostics and innovative therapeutic principles, and to make them rapidly available for clinical testing.

Research Unit Stratosphere: Cloud-enabled Declarative Information Management



Funding: Deutsche Forschungsgemeinschaft

Period: 2010 - 2013 (first funding period), 2014-2016 (second funding period)

Project partner: Technische Universität Berlin, Hasso-Plattner-Institut Potsdam

The Collaborative Research Unit “Stratosphere” aims at advancing the state of the art in data processing on parallel architectures. Stratosphere explores the power of massively parallel computing for complex information management applications. We will develop a novel, database-inspired approach to analyze, aggregate, and query very large collections of either textual or (semi-)structured data on a virtualized, massively parallel architecture.
Stratosphere will conduct research in the areas of massively parallel data processing engines, a programming model for parallel data programming, robust optimization of declarative data flow programs, continuous re-optimization and adaptation of the execution, data cleansing, and text mining. The unit will validate its work through a benchmark of the overall system performance and by demonstrators in the areas of climate research, the biosciences, and linked open data.