Humboldt-Universität zu Berlin - Faculty of Mathematics and Natural Sciences - Process Management and Information Systems

Bachelor and Master Thesis

General Information

Our team offers bachelor and master thesis topics to be written in English. There are quarterly info sessions where we explain the process of writing a thesis with our team.

The last info session took place on October 26th, 2022. Here you can find the slides and the recording of the session.

For the current topic list, see below. Furthermore, find here a summary of guidelines for working on your thesis with us.

 

Important Dates

26.10.2022: Info session

03.11.2022: Expression of interest

04.11.2022: Topic assignment

15.12.2022: Official start

 

Formatting

Please consider the following hints and guidelines for working on your thesis:

  • The thesis has to be written using the \documentclass[preprint,review,12pt]{elsarticle} LaTeX template. You can use the Overleaf template for writing your thesis.
  • A bachelor thesis has a page limit of 40 pages of text (not including cover, table of contents, references, and appendices).

  • A master thesis has a page limit of 80 pages of text (not including cover, table of contents, references, and appendices).
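
As a starting point, a minimal document skeleton using the class options named above might look like the following sketch. The title, author, address, keywords, and section names are placeholders, and references.bib is a hypothetical bibliography file; the bibliography style shown is only one of the styles shipped with elsarticle, and the official Overleaf template may be structured differently.

```latex
% Minimal sketch of a thesis skeleton; all names below are placeholders.
\documentclass[preprint,review,12pt]{elsarticle}

\begin{document}

\begin{frontmatter}
  \title{Title of the Thesis}          % placeholder
  \author{Student Name}                % placeholder
  \address{Name of the Institution}    % placeholder

  \begin{abstract}
    A short summary of the research question, method, and results.
  \end{abstract}

  \begin{keyword}
    process mining \sep visual analytics   % placeholder keywords
  \end{keyword}
\end{frontmatter}

\section{Introduction}
Motivation and research question.

\section{Background}
Overview of the relevant literature.

% Assumption: a file references.bib exists next to this document.
% elsarticle-num is one of the styles bundled with the elsarticle class.
\bibliographystyle{elsarticle-num}
\bibliography{references}

\end{document}
```

Only the text between the front matter and the bibliography counts towards the page limits stated above.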

 

Prerequisites

The candidate is expected to be familiar with the general rules of writing a scientific paper. Some general references are helpful for framing any thesis, no matter which topic:

In agreement with the supervisor, the student should study an individual list of expected readings in preparation for the actual work on the thesis.

 

Grading

The grading of the thesis takes various criteria into account, relating both to the thesis as a product and the process of establishing its content. These include, but are not limited to:

  • Correctness of spelling and grammar

  • Aesthetic appeal of documents and figures

  • Compliance with formal rules

  • Appropriateness of thesis structure

  • Coverage of relevant literature

  • Appropriateness of research question and method

  • Diligence of own research work

  • Significance of research results

  • Punctuality of work progress

  • Proactiveness of handling research progress

 

Recent Topics

If you are interested in one of the following topics, please send an email expressing your interest to Dr. Saimir Bala (firstname[dot]lastname[at]hu-berlin.de). Please explain why this topic is interesting to you and how it fits your prior studies. Also explain what your strengths are and which semester of your studies you are in.

The next deadline is 3 November 2022.

 

Topic 1: Visual Analytics of Waiting Times for Process Mining

Process mining is a family of analysis techniques that takes event sequence data as input and generates meaningful visual representations for analysts. So far, prior research on the analysis of waiting times has been limited. The goal of this thesis is to review contributions that explicitly show the timeline of events, identify opportunities for improvement, and implement and evaluate a new visualization technique.

References:

 

Topic 2: Investigating process mining use case constellations: Design of a vignette study

Process mining enables organizations to analyse business processes in a multitude of ways in order to discover, validate, enhance, or monitor their actual execution. Depending on the use case, organizations face the challenge of deciding how to leverage process mining in the best possible way. For instance, they need to decide which perspective (e.g., control-flow, case, time, or organizational), contextual level (e.g., case, process, social, or external), or usage type (ad hoc, repeated, or standard) is most beneficial for their strategic objective.

The goal of this thesis is to develop and conduct a survey that investigates which process mining use case constellations are utilized in organizations and why they were chosen. The target audience is process mining experts and middle to higher management. As a potential setup, students can follow the study design of Kollmer et al. (2022), who conducted a quantitative vignette study.

References:

  • Van Der Aalst, W., 2016. Process mining: data science in action (Vol. 2). Heidelberg: Springer.
  • Pfahlsberger, L., Mendling, J. and Eckhardt, A., 2021. Design of a process mining alignment method for building big data analytics capabilities. In Proceedings of the 54th Hawaii International Conference on System Sciences.
  • Kollmer, T., Eckhardt, A. and Reibenspiess, V., 2022. Explaining consumer suspicion: insights of a vignette study on online product reviews. Electronic Markets, pp.1-18.

Topic 3: Multi-Dimensional Visual Representations for Process Mining

Process mining techniques typically produce visual representations that show the control flow of different activities captured in an event log. Several dimensions, such as time and resources, are orthogonal to this control-flow dimension. The objective of this thesis is to develop new process mining techniques that integrate a visualization of the control flow with such additional dimensions. To this end, new algorithms will be developed, implemented, and evaluated using benchmark event log data. Good implementation skills in Java are required.

References:

  • Dumas, M., La Rosa, M., Mendling, J., & Reijers, H. A. (2018). Process monitoring. In Fundamentals of Business Process Management (pp. 413-473). Springer, Berlin, Heidelberg.
  • Van Der Aalst, W. (2016). Process mining: data science in action (Vol. 2). Heidelberg: Springer.
  • Yeshchenko, A., & Mendling, J. (2022). A Survey of Approaches for Event Sequence Analysis and Visualization using the ESeVis Framework. arXiv preprint arXiv:2202.07941.

 

Topic 4: Analyzing File Evolution Trends for Predicting Development Task Completion

Software development is supported by central repositories, such as version control systems or issue tracking systems. These repositories keep track of the evolution of the artifacts being created during the development process.

The evolution of artifacts in real-world software projects can give important insights into the effort put into specific development tasks. Understanding trends in software evolution can help to better predict the time and effort that specific artifacts take to evolve towards a final version.

This work should investigate how time-series analysis can be used to understand and predict the status of software development tasks. Real-world data from open-source repositories can be found online.

References:

  • Saimir Bala, Kate Revoredo, João Carlos de A. R. Gonçalves, Fernanda Baião, Jan Mendling, Flávia Maria Santoro:
    Uncovering the Hidden Co-evolution in the Work History of Software Projects. BPM 2017: 164-180
  • Ruohonen, Jukka, Sami Hyrynsalmi, and Ville Leppänen. "Time series trends in software evolution." Journal of Software: Evolution and Process 27.12 (2015): 990-1015.

 

Topic 5: Reconstructing the Process Mining Ecosystem - the Case of Celonis

Organizations cannot be viewed in isolation. Rather, they are embedded in a complex ecosystem of competitors, suppliers, customers, governments, and many more. Over time, organizations inside this ecosystem are subject to mimetic pressures that lead to unintentional or intentional changes, which in turn can lead to reciprocal imitation, especially in dynamic and emerging areas like the process mining industry. Mimetic pressures lead to the prevalence of certain practices that seemingly caused success within the focal organization’s industry, because those organizations share the same goals, resources, customers, and human capital, or experience similar constraints.

Currently, the process mining ecosystem is characterized by a relatively homogeneous constellation, with many organizations that have implemented a similar technological foundation, serve the same group of customers, and select the same kind of strategic partnerships. The goal of this thesis (a master thesis is preferred) is to reconstruct the history of the ecosystem behind the market-leading process mining vendor Celonis and to explain why Celonis favoured certain actions over others in order to be successful.

References:

  • Shahzad Ansari, Raghu Garud, and Arun Kumaraswamy. The disruptor’s dilemma: TiVo and the US television ecosystem. Strategic Management Journal, 37(9):1829–1853, 2016.
  • DiMaggio, P.J.; Powell, W.W. (1983): The iron cage revisited: institutional isomorphism and collective rationality in organizational fields. American Sociological Review 48, 147-160.
  • Wil MP van der Aalst. On the representational bias in process mining. In 2011 IEEE 20th International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises, pages 2–7. IEEE, 2011.
  • Bernhard Lingens, Lucas Miehé, and Oliver Gassmann. The ecosystem blueprint: How firms shape the design of an ecosystem according to the surrounding conditions. Long Range Planning, page 102043, 2020.

 

Topic 6: Multi-Dimensional Process Analysis on Software Data

Software processes are complex, as they involve multiple actors and data that interact with one another over time. Process science is the discipline that studies processes. Works in this area already use multi-dimensional analysis approaches to provide new insights into business processes that go beyond the discovery of control flow via process mining. For instance, these new techniques make it possible to explain why certain activities are bottlenecks by simultaneously analyzing various perspectives such as case, time, resources, and control flow. The aim of this thesis is to explore the applicability of multi-dimensional process analysis to data from software development.

References:

  • Fahland, Dirk. "Process mining over multiple behavioral dimensions with event knowledge graphs." Process Mining Handbook. Springer, Cham, 2022. 274-319.
  • Esser, Stefan, and Dirk Fahland. "Multi-dimensional event data in graph databases." Journal on Data Semantics 10.1 (2021): 109-141.
  • AlMarzouq, Mohammad, Abdullatif AlZaidan, and Jehad AlDallal. "Mining GitHub for research and education: challenges and opportunities." International Journal of Web Information Systems (2020).