Humboldt-Universität zu Berlin - Mathematisch-Naturwissenschaftliche Fakultät - Institut für Informatik

Bachelor's Thesis Defence: Martin Wackerbauer

  • When: 25 April 2018, 10:15 to 23:59
  • Where: Rudower Chaussee 25, Haus 4, Room 410

Martin Wackerbauer will defend his bachelor's thesis, titled "Automatically Identifying Key Sentences in Oncological Abstracts Using Semi-Supervised Learning".

 

Abstract

We present a machine learning pipeline that identifies key sentences in abstracts of oncological articles to aid evidence-based medicine. As a gold standard of sentences containing high-quality information, we use the CIViC database of clinical evidence summaries. This entails an implicit definition of a sentence’s relevance, linked to its similarity to sentences found in professional summaries.
To obtain a realistic model, we use the abstracts summarised in the gold standard as unlabelled examples in semi-supervised learning. The Hallmarks of Cancer corpus serves as auxiliary data, providing additional labelled examples for training and testing. To mitigate difficulties arising from heterogeneous data sources, we propose using learning from positive and unlabelled data for noise detection and explore the behaviour of different learning heuristics. The best model achieves 84% accuracy on our dataset.


 

Everyone interested is cordially invited.