Master's Seminar: Algorithms and Methods of Time Series Analysis


Debunking Myths in Time Series Deep Learning: A Reproducibility Study

Dr. Patrick Schäfer

Content

Deep learning has firmly established itself as a leading-edge technology across various domains, accompanied by an unprecedented surge in research publications. In the realm of time series classification (TSC), groundbreaking achievements have been reported, setting a new standard on the UCR time series benchmark archive that is beyond the reach of conventional machine learning approaches.

Yet deep learning research in TSC has encountered worrisome trends. Many seemingly unattainable benchmark results stem from flawed training protocols that overfit the test data, i.e. the reported figure is the maximum test accuracy the model reached over multiple training epochs. This is biased because the test set is used to choose the model, which gives an unfair advantage over conventionally trained models. Nevertheless, a concerning number of approaches published at esteemed venues fall into this pitfall. Additionally, many approaches are evaluated on subsets of the UCR benchmark datasets without any stated rationale for the selection, which gives the impression of cherry-picking. Finally, many papers do not publish their source code, even though they build on common frameworks like TensorFlow and PyTorch, which impedes reproducibility.
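The first pitfall can be made concrete with a small sketch. The per-epoch accuracies below are invented purely for illustration, and the function names `biased_report` and `honest_report` are hypothetical. The sketch only contrasts reporting the best test accuracy over all epochs with choosing the epoch on a held-out validation split and then reading off the test accuracy once.

```python
from typing import Sequence


def biased_report(test_acc: Sequence[float]) -> float:
    """Pitfall: report the best test accuracy observed over all epochs."""
    return max(test_acc)


def honest_report(val_acc: Sequence[float], test_acc: Sequence[float]) -> float:
    """Choose the epoch by validation accuracy only, then read the test accuracy once."""
    if len(val_acc) != len(test_acc) or not val_acc:
        raise ValueError("need one validation and one test accuracy per epoch")
    # Index of the epoch with the highest validation accuracy (first one on ties).
    best_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i])
    return test_acc[best_epoch]


# Hypothetical accuracies for five epochs (made-up numbers, not real results).
val_acc = [0.70, 0.78, 0.81, 0.80, 0.79]
test_acc = [0.68, 0.75, 0.77, 0.82, 0.76]

print("biased:", biased_report(test_acc))           # best epoch picked on the test set
print("honest:", honest_report(val_acc, test_acc))  # epoch picked on the validation set
```

By construction the biased figure can never be lower than the honest one, which is why comparing it against conventionally evaluated models is unfair.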

This seminar is dedicated to addressing common pitfalls that pose a threat to the field of TSC. Throughout the seminar, groups consisting of 2-3 students will select from a list of current time series deep learning models, examine potential pitfalls associated with these models, and attempt to replicate their results.

This process will provide participants with invaluable insights into the current landscape of deep learning for time series classification.

Dates

Kick-off meeting: Friday, 3 November 2023, 13:00, RUD26 1'305
Final presentations: Friday, 2 February 2024, 12:15, RUD25 3.101

Introductory literature

Foumani, Navid Mohammadi, et al. "Deep learning for time series classification and extrinsic regression: A current survey." arXiv preprint arXiv:2302.02515 (2023).
Middlehurst, Matthew, Patrick Schäfer, and Anthony Bagnall. "Bake off redux: a review and experimental evaluation of recent time series classification algorithms." arXiv preprint arXiv:2304.13029 (2023).
Fawaz, Hassan Ismail, et al. "Deep learning for time series classification: a review." arXiv preprint arXiv:1809.04356 (2018).
https://scholar.google.de: for searching scientific papers
https://sktime.org: a Python library dedicated to time series (classifiers)
http://timeseriesclassification.com: a website dedicated to univariate time series classifiers

Prerequisites

Good knowledge of algorithms and data structures (e.g. from the lecture of the same name)
Knowledge of statistics and/or machine learning (or the willingness to acquire it)

Registration

The number of participants is limited; registration is via AGNES.

Requirements

TBA

Certificate and course credit

The seminar can be credited towards

Diplom Informatik (computer science)
Master Informatik (computer science)
Master Wirtschaftsinformatik (information systems)

To obtain the certificate, participants must:

attend the kick-off meetings at which topics are assigned,
communicate regularly with the supervisor,
give a short presentation of the topic (around the middle of the semester),
give a scientific talk in the block seminar at the end of the semester,
take part in the competition, including presenting the results, and
write a seminar paper.

Humboldt-Universität zu Berlin - Mathematisch-Naturwissenschaftliche Fakultät - Wissensmanagement in der Bioinformatik