Humboldt-Universität zu Berlin - Faculty of Mathematics and Natural Sciences - Knowledge Management in Bioinformatics

Master's Seminar: Algorithms and Methods of Time Series Analysis

Debunking Myths in Time Series Deep Learning: A Reproducibility Study

Dr. Patrick Schäfer

Content

Deep learning has firmly established itself as a leading-edge technology across various domains, accompanied by an unprecedented surge in research publications. In the realm of time series classification (TSC), groundbreaking achievements have been reported, setting a new standard on the UCR time series benchmark archive that is beyond the reach of conventional machine learning approaches.

Yet deep learning research in TSC shows worrisome trends. Many seemingly unattainable benchmark results stem from improper model training that overfits the test data, i.e. the test accuracy is monitored over multiple epochs and the maximum is reported. This is biased and gives an unfair advantage over conventionally trained models, yet a concerning number of approaches published at esteemed venues fall into this pitfall. Additionally, many approaches are evaluated on subsets of the UCR benchmark datasets without stating the rationale for the selection, which looks like cherry-picking. Finally, many papers do not publish any source code, even though they build on common frameworks such as TensorFlow and PyTorch, which impedes reproducibility.
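The first pitfall, reporting the best test accuracy seen over the training epochs, can be illustrated with a small simulation. The sketch below is not taken from any of the seminar's papers. It assumes a hypothetical model whose true accuracy stays fixed at 0.70 in every epoch, so that any difference between epochs is pure sampling noise on a finite test set. The function names and all parameter values are illustrative assumptions.

```python
import random


def measured_accuracy(rng, true_acc, n_samples):
    """Accuracy observed on n_samples examples when each prediction
    is correct with probability true_acc (assumed, hypothetical model)."""
    hits = sum(rng.random() < true_acc for _ in range(n_samples))
    return hits / n_samples


def simulate(true_acc=0.70, n_samples=100, n_epochs=20, n_runs=300, seed=0):
    """Compare two reporting protocols, averaged over n_runs training runs.

    Returns (leaky, sound):
      leaky - mean of the maximum test accuracy over all epochs
      sound - mean test accuracy at the epoch chosen on a separate validation set
    """
    rng = random.Random(seed)
    leaky_total = 0.0
    sound_total = 0.0
    for _ in range(n_runs):
        # Per-epoch accuracies on two independent finite samples.
        test = [measured_accuracy(rng, true_acc, n_samples) for _ in range(n_epochs)]
        val = [measured_accuracy(rng, true_acc, n_samples) for _ in range(n_epochs)]

        # Pitfall: the test set itself is used to pick the epoch.
        leaky_total += max(test)

        # Sound protocol: pick the epoch on validation data, report test once.
        best_epoch = max(range(n_epochs), key=val.__getitem__)
        sound_total += test[best_epoch]
    return leaky_total / n_runs, sound_total / n_runs


leaky, sound = simulate()
print(f"best test accuracy over epochs:         {leaky:.3f}")
print(f"test accuracy at best validation epoch: {sound:.3f}")
```

Under these assumptions the first number lies clearly above the true accuracy of 0.70 although the model never improves, while the second stays close to it. Real training curves are not stationary, so the size of the effect in practice will differ.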

This seminar is dedicated to addressing common pitfalls that pose a threat to the field of TSC. Throughout the seminar, groups consisting of 2-3 students will select from a list of current time series deep learning models, examine potential pitfalls associated with these models, and attempt to replicate their results.

This process will provide participants with invaluable insights into the current landscape of deep learning for time series classification.

Dates

  • Introductory session: Friday, 03.11.2023, 13:00, RUD26 1'305
  • Final presentation: Friday, 02.02.2024, 12:15, RUD25 3.101

Introductory Literature

  • Foumani, Navid Mohammadi, et al. "Deep learning for time series classification and extrinsic regression: A current survey." arXiv preprint arXiv:2302.02515 (2023).
  • Middlehurst, Matthew, Patrick Schäfer, and Anthony Bagnall. "Bake off redux: a review and experimental evaluation of recent time series classification algorithms." arXiv preprint arXiv:2304.13029 (2023).
  • Fawaz, Hassan Ismail, et al. "Deep learning for time series classification: a review." arXiv preprint arXiv:1809.04356 (2018).
  • https://scholar.google.de for searching scientific papers
  • https://sktime.org a Python library dedicated to time series (classifiers)
  • http://timeseriesclassification.com a website dedicated to univariate time series classifiers

Prerequisites

  • Good knowledge of algorithms and data structures (e.g. from the lecture of the same name)
  • Knowledge of statistics and/or machine learning (or the willingness to acquire it)

Registration

The number of participants is limited; registration is via AGNES.

Requirements

TBA

Credit and Eligibility

The seminar can be credited toward

  • Diplom in Computer Science
  • Master in Computer Science
  • Master in Business Informatics (Wirtschaftsinformatik)

The requirements for credit are:

  • attendance at the introductory sessions for topic assignment,
  • regular communication with the supervisor,
  • a short presentation of the topic (around the middle of the semester),
  • giving a scientific talk in the block seminar at the end of the semester,
  • participation in the competition, including a presentation of the results, and
  • writing a written report (seminar paper).

Templates