Humboldt-Universität zu Berlin - Faculty of Mathematics and Natural Sciences - Knowledge Management in Bioinformatics


Knowledge Management in Bioinformatics Research Group

New Developments in Databases and Bioinformatics

Prof. Ulf Leser

  • When and where? See the list of talks

The members of the research group use this seminar as a forum for discussion and exchange. Students and guests are warmly invited.

The following talks are scheduled so far:

Date & Venue | Topic | Speaker
Friday, 12 June 2020, 11:00 s.t., Zoom | Biomedical Event Extraction as Multi-Turn Question Answering | Xing Wang
Friday, 25 September 2020, 11:00, Zoom | Sexual Predator Detection in Chats using Machine Learning on Android | Matthias Vogt
Friday, 9 October 2020, 10:00 c.t., RUD 26, 1’308 | Modeling Hierarchical OPS Codes In Multi-Label Recurrent Neural Network Based Document Classification | Lennart Grosser


Biomedical Event Extraction as Multi-Turn Question Answering (Xing Wang)

Extracting relation and event structures involving genes and proteins is an important task in biomedical text mining. The goal is to learn interactions between biomedical entities such as genes and proteins, the trigger words that signal them, and the corresponding types of biochemical reactions. Extracted events can be combined with other entities or events to form nested event structures and to model parts of biological pathways and networks. We approach biomedical event extraction as question answering, uncovering the structure of an event step by step over multiple turns. We use a pretrained BERT model and evaluate the results on the GENIA and Pathway Curation datasets from the BioNLP shared tasks. We compare our multi-turn question answering approach with the results of the TEES event extraction system.
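
The multi-turn idea can be illustrated with a minimal sketch. It is not the speaker's implementation: the question templates, the `Event` structure, the example sentence, and the `answer` callable (a stand-in for a BERT-based extractive question answering model) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for an extractive QA model:
# (question, context) -> answer span, or None if there is no answer.
Answerer = Callable[[str, str], Optional[str]]


@dataclass
class Event:
    event_type: str
    trigger: str
    arguments: Dict[str, str] = field(default_factory=dict)


def extract_event(context: str, event_type: str, roles: List[str],
                  answer: Answerer) -> Optional[Event]:
    # Turn 1: ask for the trigger word of the given event type.
    trigger = answer(f"What is the trigger of the {event_type} event?", context)
    if trigger is None:
        return None
    event = Event(event_type, trigger)
    # Later turns: ask for each argument role, conditioning the
    # question on the trigger found in the first turn.
    for role in roles:
        question = (f"What is the {role} of the {event_type} event "
                    f"triggered by '{trigger}'?")
        span = answer(question, context)
        if span is not None:
            event.arguments[role] = span
    return event


def toy_answer(question: str, context: str) -> Optional[str]:
    # Rule-based placeholder so the sketch runs without a trained model.
    if "trigger" in question and "triggered by" not in question:
        return "phosphorylates" if "phosphorylates" in context else None
    if "Theme" in question:
        return "STAT3" if "STAT3" in context else None
    if "Cause" in question:
        return "JAK2" if "JAK2" in context else None
    return None


# Made-up example sentence, not taken from the GENIA or Pathway Curation data.
event = extract_event("JAK2 phosphorylates STAT3.", "Phosphorylation",
                      ["Theme", "Cause"], toy_answer)
print(event)
```

In a real system, `toy_answer` would be replaced by a trained model, and an answer from one turn could itself be another event, which is how nested structures would arise.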

Sexual Predator Detection in Chats using Machine Learning on Android (Matthias Vogt)

An important risk children face online is grooming, where a sexual predator befriends a child and establishes an emotional connection to lower the child's inhibitions, with the objective of sexual abuse or of obtaining sexual content such as images. This is a major public safety concern.
To address this problem, we propose two early warning systems that aim to disrupt the grooming process. Just as on desktop, such systems are needed on mobile devices, where they could analyse end-to-end encrypted chats in messengers or be used to build privacy-friendly parental control apps. Both of our systems can be implemented on mobile devices, and for one of them we also present a demo app.
Our early warning systems use machine learning models: a convolutional neural network (CNN) and a fine-tuned transformer. We describe their implementation and evaluate how early and how accurately they predict whether a victim is chatting with a sexual predator. As a novel contribution, we also evaluate our early warning systems on a large number of full-length predator chats. Our fine-tuned transformer considerably outperforms the state of the art for early sexual predator detection. For this approach, we also publish our code and models.
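
The notion of an early warning can be sketched generically: a classifier scores the conversation seen so far after each new message, and a warning is raised as soon as the score crosses a threshold. The function names, the threshold, and the placeholder scorer below are hypothetical; the talk's CNN and transformer models are not reproduced here.

```python
from typing import Callable, Optional, Sequence


def first_warning(messages: Sequence[str],
                  score: Callable[[Sequence[str]], float],
                  threshold: float = 0.5) -> Optional[int]:
    """Return how many messages had been seen when the warning first fires.

    Returns None if the score never reaches the threshold.
    """
    for n in range(1, len(messages) + 1):
        # Score only the prefix of the chat that is available so far.
        if score(messages[:n]) >= threshold:
            return n
    return None


def toy_score(prefix: Sequence[str]) -> float:
    # Placeholder scorer: share of messages carrying a hypothetical
    # marker token. A real system would call a trained model here.
    return sum("FLAG" in message for message in prefix) / len(prefix)


chat = ["hi", "FLAG a", "FLAG b", "bye"]
warned_after = first_warning(chat, toy_score, threshold=0.5)
print(warned_after)
```

The earlier the returned position, the more of the conversation can still be interrupted, which is why earliness is evaluated alongside accuracy.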

Modeling Hierarchical OPS Codes In Multi-Label Recurrent Neural Network Based Document Classification (Lennart Grosser)

The thesis addresses the multi-label classification of OPS codes from German surgical reports using recurrent neural networks, focusing on the hierarchy of the OPS codes and on modeling it as part of the architecture. In particular, a model architecture consisting of FastText embeddings, an LSTM, and attention is compared with one consisting of tf-idf embeddings and logistic regression, and both are examined with regard to their predictive performance.
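
One simple way to expose a code hierarchy to a multi-label classifier is to expand each code into its ancestors and encode the result as a multi-hot target vector. The sketch below shows that label expansion only; it is not necessarily how the thesis models the hierarchy. It assumes that ancestors can be read off a code string by truncation, and the code strings and the `label_index` are illustrative.

```python
from typing import Iterable, List, Sequence, Set


def ancestors(code: str) -> List[str]:
    """Return the code and its assumed ancestors, most general first.

    Assumption: the hierarchy follows from truncating the code string,
    e.g. '5-470.11' -> '5', '5-4', '5-47', '5-470', '5-470.1', '5-470.11'.
    """
    chapter, _, rest = code.partition("-")
    category, _, suffix = rest.partition(".")
    levels = [chapter]
    for i in range(1, len(category) + 1):
        levels.append(f"{chapter}-{category[:i]}")
    for i in range(1, len(suffix) + 1):
        levels.append(f"{chapter}-{category}.{suffix[:i]}")
    return levels


def expand_labels(codes: Iterable[str]) -> Set[str]:
    # Union of all codes together with their ancestors.
    expanded: Set[str] = set()
    for code in codes:
        expanded.update(ancestors(code))
    return expanded


def multi_hot(codes: Iterable[str], label_index: Sequence[str]) -> List[int]:
    # One output position per known label; 1 if the label or one of
    # its descendants was assigned to the document.
    labels = expand_labels(codes)
    return [1 if label in labels else 0 for label in label_index]


label_index = ["5", "5-47", "5-470", "5-470.11", "8-800"]
vector = multi_hot(["5-470.11"], label_index)
print(vector)
```

Such vectors could serve as targets for either of the compared model families, for example one logistic regression output per label position.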

Contact: Patrick Schäfer; patrick.schaefer(at)