Bachelor and Master Thesis
General Information
Our team offers bachelor and master thesis topics as well as student projects to be written in English. There are quarterly info sessions where we explain the process of writing a thesis with our team.
The last info session took place on April 5th, 2023. Here you can find the slides (part I, part II) and the recording (part I, part II) of the previous session.
For the current topic list, see below. A summary of guidelines for working on your thesis with us is available here.
Important Dates
05.04.2023: Info session at 11:30
26.04.2023: Expression of interest
28.04.2023: Topic assignment
26.05.2023: Research proposal submission
01.06.2023: Official start (if proposal sufficient)
Formatting
Please consider the following hints and guidelines for working on your thesis:
- The thesis has to be written using the \documentclass[preprint,review,12pt]{elsarticle} LaTeX template. You can use the Overleaf template for writing your thesis.
- Page limits are as follows:
  - Bachelor Informatik: 40 pages; Kombibachelor Lehramt Informatik: 30 pages
  - Master Informatik: 80 pages; Master Information Systems: 60 pages
- The limits do not include the cover, table of contents, references, and appendices.
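A minimal skeleton using the document class named above might look as follows. This is only a sketch: the title, author, affiliation, and section name are placeholders, not prescribed content, and the Overleaf template takes precedence wherever it differs.

```latex
% Minimal thesis skeleton, assuming the elsarticle class is available
% (it ships with common TeX distributions and with Overleaf).
\documentclass[preprint,review,12pt]{elsarticle}

\begin{document}

\begin{frontmatter}
  \title{Placeholder Thesis Title}            % hypothetical title
  \author{Firstname Lastname}                 % hypothetical author
  \address{Humboldt-Universit\"at zu Berlin}  % placeholder affiliation

  \begin{abstract}
    A short summary of the research question, method, and results.
  \end{abstract}
\end{frontmatter}

\section{Introduction}
Placeholder text.

\end{document}
```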
Prerequisites
The candidate is expected to be familiar with the general rules of writing a scientific paper. Some general references are helpful for framing any thesis, no matter which topic:
- Wil van der Aalst: How to Write Beautiful Process and Data Science Papers? Archive Report (2022).
- Jan Recker: Scientific Research in Information Systems: A Beginner's Guide. Springer, Heidelberg, Germany (2021).
- Jan Mendling, Benoit Depaire, Henrik Leopold: Theory and Practice of Algorithm Engineering. Archive Report (2021).
- Claes Wohlin, Pär Runeson, Martin Höst, Magnus Ohlsson, Björn Regnell, Anders Wesslén: Experimentation in software engineering. Springer Science & Business Media (2012).
- Ken Peffers, Tuure Tuunanen, Marcus A. Rothenberger, Samir Chatterjee: A Design Science Research Methodology for Information Systems Research. Journal of Management Information Systems 24(3): 45-77 (2008).
- Barbara Kitchenham, Rialette Pretorius, David Budgen, Pearl Brereton, Mark Turner, Mahmood Niazi, Stephen G. Linkman: Systematic literature reviews in software engineering - A tertiary study. Information & Software Technology 52(8): 792-805 (2010).
- Ad Lagendijk: Survival Guide for Scientists: Writing, Presentation, Email. Amsterdam University Press (2008).
- Adam LeBrocq: Journal of the Association for Information Systems Style Guide. http://aisel.aisnet.org/jais/JAIS_Style_Guide_2013.pdf
In agreement with the supervisor, the student should study an individual list of expected readings in preparation for the actual work on the thesis.
Grading
The grading of the thesis takes various criteria into account, relating both to the thesis as a product and the process of establishing its content. These include, but are not limited to:
- Correctness of spelling and grammar
- Aesthetic appeal of documents and figures
- Compliance with formal rules
- Appropriateness of thesis structure
- Coverage of relevant literature
- Appropriateness of research question and method
- Diligence of own research work
- Significance of research results
- Punctuality of work progress
- Proactiveness of handling research progress
Recent Topics
If you are interested in one of the following topics, please send an email expressing your interest to Dr. Saimir Bala (firstname[dot]lastname[at]hu-berlin.de). Please explain why this topic is interesting to you and how it fits your prior studies. Also explain what your strengths in your studies are and which semester you are in.
The next deadline is 26 April 2023.
Topic 1: Federated Learning for Process Monitoring (Bachelor/Master)
Federated Learning enables the learning of machine learning models over distributed datasets, which can then be used for predictive or prescriptive analytics. One promising domain where such models can be applied is process monitoring, where the goal is to predict the remaining time of a business process or potential unwanted outcomes.
This thesis project seeks to explore how federated learning techniques can be utilized in the context of predictive process monitoring. The project can be adapted to either a bachelor's or master's thesis project, and the student can expect close supervision and the possibility of publishing their results if they meet a certain quality threshold. However, it's important to note that a high level of technical expertise and dedication will be required to carry out this project successfully.
Initial References:
- Di Francescomarino, C., & Ghidini, C. (2022). Predictive process monitoring. Process Mining Handbook. LNBIP, 448, 320-346.
- Li, T., Sahu, A. K., Talwalkar, A., & Smith, V. (2020). Federated learning: Challenges, methods, and future directions. IEEE Signal Processing Magazine, 37(3), 50-60.
Supervisor: Stephan Fahrenkrog-Petersen
Topic 2: Building Event Logs based on Semi-structured Data from the German Parliament (Bachelor/Master)
The aim of this project is to build event logs using publicly available semi-structured data from the German Bundestag. The objective is to extract the event data and gain insight into the challenges and difficulties associated with this task. The data describes political decision-making processes within the German Bundestag, and it is hoped that the thesis will also explore how process mining techniques can be applied in such a setting.
This project is suitable for either a bachelor's or master's thesis. The student can expect close supervision and the opportunity to publish their results if they meet a certain quality threshold. However, it's important to note that this project will require a high level of dedication and a good command of German since the data is in German. Additionally, knowledge of natural language processing (NLP) will be highly beneficial for this project.
Initial References:
- Remy, S., Pufahl, L., Sachs, J. P., Böttinger, E., & Weske, M. (2020). Event log generation in a health system: a case study. In Business Process Management: 18th International Conference, BPM 2020, Seville, Spain, September 13–18, 2020, Proceedings 18 (pp. 505-522). Springer International Publishing.
- Kecht, C., Egger, A., Kratsch, W., & Röglinger, M. (2021, October). Event Log Construction from Customer Service Conversations Using Natural Language Inference. In 2021 3rd International Conference on Process Mining (ICPM) (pp. 144-151). IEEE.
Supervisor: Stephan Fahrenkrog-Petersen
Chatbots and artificial intelligence (AI) technologies have been widely adopted in various industries to improve business processes and support analysis. Among these AI technologies, ChatGPT, a large language model trained by OpenAI, has gained attention due to its ability to generate human-like responses with great precision and depth. The aim of this thesis is to investigate ChatGPT's ability to support decision-making when deriving use cases from process mining outcomes. Specifically, the research will examine the potential benefits of ChatGPT in improving efficiency and accuracy. To achieve these objectives, the student should apply a design science approach that creates an artifact in the form of optimized prompts for use by analysts. The artifact needs to be evaluated by process experts over multiple iterations in order to measure problem fit.
Initial References:
- White, J., Fu, Q., Hays, S., Sandborn, M., Olea, C., Gilbert, H., Elnashar, A., Spencer-Smith, J. and Schmidt, D.C., 2023. A prompt pattern catalog to enhance prompt engineering with ChatGPT. arXiv preprint arXiv:2302.11382.
- Hevner, A. and Chatterjee, S., 2010. Design science research in information systems. Design research in information systems: theory and practice, pp.9-22.
- Van Der Aalst, W., Adriansyah, A., De Medeiros, A.K.A., Arcieri, F., Baier, T., Blickle, T., Bose, J.C., Van Den Brand, P., Brandtjen, R., Buijs, J. and Burattin, A., 2012. Process mining manifesto. In Business Process Management Workshops: BPM 2011 International Workshops, Clermont-Ferrand, France, August 29, 2011, Revised Selected Papers, Part I 9 (pp. 169-194). Springer Berlin Heidelberg.
Supervisor: Lukas Pfahlsberger
Topic 4: Analyzing File Evolution Trends for Predicting Development Task Completion (Bachelor/Master)
Topic 5: Characterizing iterations in software development (Bachelor/Master)
Iterations are present everywhere and at different levels when it comes to processes. Software development is a set of activities aiming at delivering software. Here, iteration may take various forms, such as breaking down the development activities into phases or chunks or using specific programming styles (e.g., trial and error, test-driven, etc.). Hence, iteration is a fundamental concept of systems engineering processes.
However, the literature remains unclear in characterizing these iterations. The aim of this thesis is to provide a characterization of the different iteration types that can be observed when analyzing the development of software. A basic requirement for this thesis is the ability to implement scripts/code that can help with integrating/analyzing software repositories such as GitHub, databases, and various semi-structured or unstructured data sources.
Initial References:
- N. Berente and K. Lyytinen, “What Is Being Iterated? Reflections on Iteration in Information System Engineering Processes,” in Concept. Model. Inf. Syst. Eng. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007, pp. 261–278.
- Von Krogh, G., & Von Hippel, E. (2006). The promise of research on open source software. Management science, 52(7), 975-983.
- Berente, N., & Lyytinen, K. (2008). Iteration in systems analysis and design: Cognitive processes and representational artifacts.
Supervisor: Saimir Bala
Topic 6: Exploring the Digitalization of Creative Work through Knowledge Workers Utilizing AI-Based Tools (Master)
The rapid advancement of artificial intelligence (AI) and its integration into various sectors has led to a digital transformation of creative work, called artificial creativity. This research aims to investigate the impact of AI-based tools on the productivity, creativity, and overall work processes of knowledge workers in the creative industries. The study will examine how these AI-driven technologies are reshaping the nature of creative work and evaluate the potential benefits and challenges associated with their adoption.
This research requires dedication in terms of intense literature review and survey design. The student will be expected to:
- Conduct a thorough literature review on the current state of AI in creative industries, focusing on knowledge workers and their use of AI-based tools.
- Design and implement a mixed-methods approach, combining qualitative interviews with industry professionals and quantitative surveys of knowledge workers in the creative sector.
- Analyze the data to determine the impacts of AI-driven tools on productivity, creativity, and the overall work process.
- Develop recommendations for organizations and individuals in the creative industries to harness the full potential of AI-based tools while mitigating potential risks.
The student can expect close supervision and the opportunity to publish their results if they meet a certain quality threshold.
Initial References:
- Cropley, D. H., Medeiros, K. E., & Damadzic, A. (2023). The Intersection of Human and Artificial Creativity. In D. Henriksen & P. Mishra (Eds.), Creative Provocations: Speculations on the Future of Creativity, Technology & Learning (pp. 19–34). Springer International Publishing. https://doi.org/10.1007/978-3-031-14549-0_2
- Frosio, G. (2023). The Artificial Creatives: The Rise of Combinatorial Creativity from Dall-E to GPT-3. In E. Elgar (Ed.), Handbook of Artificial Intelligence at Work: Interconnections and Policy Implications. https://papers.ssrn.com/abstract=4350802
- White, C. (2023). Opinion: Artificial intelligence can’t reproduce the wonders of original human creativity. The Star. https://www.thestar.com.my/tech/tech-news/2023/01/18/opinion-artificial-intelligence-cant-reproduce-the-wonders-of-original-human-creativity
Supervisor: Jennifer Haase
This research aims to investigate the relationship between the use of diverse prompts and the level of creativity exhibited by ChatGPT, a prominent AI language model. By employing a standardized creativity test, the study will explore how different types of prompts can enhance the model's creative output and determine the optimal conditions for generating highly creative responses.
This research requires dedication in terms of literature review as well as a web search on prompting. The testing will follow a black-box testing design, combined with standardized (human) creativity measures. The student can expect close supervision and the opportunity to publish their results if they meet a certain quality threshold.
Initial References:
- Kocoń, J., Cichecki, I., Kaszyca, O., Kochanek, M., Szydło, D., Baran, J., Bielaniewicz, J., Gruza, M., Janz, A., Kanclerz, K., Kocoń, A., Koptyra, B., Mieleszczenko-Kowszewicz, W., Miłkowski, P., Oleksy, M., Piasecki, M., Radliński, Ł., Wojtasik, K., Woźniak, S., & Kazienko, P. (2023). ChatGPT: Jack of all trades, master of none. https://doi.org/10.48550/arXiv.2302.10724
- Dell’Aversana, P. (2023). GPT-3: A new cooperation scenario between humans and machines. Benefits and limitations of GPT-3 as a coding virtual assistant. https://doi.org/10.13140/RG.2.2.32450.04800
- Organisciak, P., Acar, S., Dumas, D., & Berthiaume, K. (2022). Beyond Semantic Distance: Automated Scoring of Divergent Thinking Greatly Improves with Large Language Models. https://doi.org/10.13140/RG.2.2.32393.31840
Supervisor: Jennifer Haase
Topic 8: Process mining in Industry 4.0: how is process mining being used in the smart manufacturing industry? (Bachelor/Master)
Background:
In Industry 4.0, the use of smart technologies and sensors enhances the productivity of manufacturing technologies. Process mining technologies can be used in the context of the smart manufacturing process as a data analysis tool that provides analysts with insights about the process.
Research problem:
The core research problem addressed is: How can process mining be applied to the smart manufacturing industry? The aim is to propose a method for applying process mining techniques to the smart manufacturing industry.
Requirements:
The candidate must have previous knowledge of process mining. Further desirable requirements are proactivity and self-organization.
Initial References:
- Cristina-Claudia Osman, Ana-Maria Ghiran: When Industry 4.0 meets Process Mining. KES 2019: 2130-2136
- Mannhardt, F., Petersen, S. & Oliveira, M. Process Mining and Privacy in Smart Manufacturing. Informatik Spektrum 42, 336–339 (2019). https://doi.org/10.1007/s00287-019-01199-6
- Dina Bayomie, Kate Revoredo, Stefan Bachhofner, Kabul Kurniawan, Elmar Kiesling, Jan Mendling: Analyzing Manufacturing Process By Enabling Process Mining on Sensor Data. PoEM Workshops 2022
Supervisor: Kate Revoredo
Topic 9: Domain knowledge and process mining: how are domain knowledge artifacts being considered by process mining techniques? (Bachelor/Master)
The quality of the outcomes of process mining techniques strongly relies on adequate input data. Domain knowledge plays an important role in the task of choosing and preparing the data. Usually, process mining techniques rely on domain expert knowledge, which can be error-prone and is not always available. Knowledge representation artifacts such as ontologies or knowledge graphs may support the process mining techniques.
Research problem:
The core research problem addressed is: How can process mining techniques benefit from domain knowledge artifacts, such as ontologies or knowledge graphs?
This thesis can be conducted in two ways:
- With the aim of proposing a method that uses a knowledge graph for supporting process mining
- With the aim of doing a case study, i.e., applying process mining with the support of domain knowledge in a real-world case
Requirements:
The candidate must have previous knowledge of process mining and knowledge graphs. Further desirable requirements are proactivity and self-organization.
Initial References:
- Mohammad Khanbabaei, Farzad Movahedi Sobhani, Mahmood Alborzi, Reza Radfar: Developing an integrated framework for using data mining techniques and ontology concepts for process improvement. J. Syst. Softw. 137: 78-95 (2018).
- Stefano Bistarelli, Tommaso Di Noia, Marina Mongiello, Francesco Nocera: PrOnto: an Ontology Driven Business Process Mining Tool. KES 2017: 306-315
- Dirk Fahland: Process Mining over Multiple Behavioral Dimensions with Event Knowledge Graphs. Process Mining Handbook 2022: 274-319
Supervisor: Kate Revoredo
Topic 10: Investigating Bursts of Process Complexity (Master)
A recent study by Pentland and colleagues (2020) uses a simulation design based on MATLAB to study the dynamics of process drift. A key observation of their study is that so-called bursts of complexity can occur. This is surprising and controversial, since there is no empirical basis for this argument. The aim of this Master thesis is to replicate the simulation design by Pentland and colleagues and extend it with other types of complexity measurements. The proposition of this work is that the existence of bursts of complexity depends on how complexity is measured. MATLAB, process mining, and implementation skills are required.
Initial References:
- Pentland, B. T., Liu, P., Kremser, W., & Hærem, T. (2020). The Dynamics of Drift in Digitized Processes. MIS Quarterly, 44(1).
- Augusto, A., Mendling, J., Vidgof, M., & Wurm, B. (2022). The connection between process complexity of event sequences and models discovered by process mining. Information Sciences, 598, 196-215.
Supervisor: Jan Mendling
Topic 11: Process Improvement in the Faculty of Mathematics and Natural Sciences at HU (Bachelor)
Currently, several organizational processes at HU Berlin build on the processing of PDF forms, often via email, between different departments. The objective of this bachelor thesis is to investigate how such PDF-based processes can be systematically redesigned. The aim is to develop a design method and to exemplify it for one specific process by implementing it using Camunda, Jira, or a comparable workflow system. Native-level German is required, as are BPM knowledge and implementation skills.
Reference:
- Dumas, M., La Rosa, M., Mendling, J., & Reijers, H. A. (2018). Fundamentals of Business Process Management. Springer.
Supervisor: Jan Mendling
Topic 12: Analysis of Career Paths (Master)
A seminal paper by Andrew Abbott introduced sequence analysis to the social sciences in 1990. Since then, major advancements have been made in the fields of process mining and visual analytics. The aim of this Master thesis is to build a dataset of career paths of different types of professionals, including musicians (as in Abbott's work) and, for example, academics, and to develop process mining analysis techniques that address the specific characteristics of these data. Process mining and implementation skills are required.
Reference:
- Abbott, A., & Hrycak, A. (1990). Measuring resemblance in sequence data: An optimal matching analysis of musicians' careers. American journal of sociology, 96(1), 144-185.
Supervisor: Jan Mendling
Topic 13: Timeline-based Process Mining (Bachelor/Master)
In process mining, various techniques have been defined for discovering process models from event log data. So far, the main focus of such techniques is to deconstruct the order and dependencies between the different types of events. The aim of this thesis is to develop new analysis techniques that explicitly take the time axis into account. There are various opportunities to design and implement visual analytics in this area. Process mining and implementation skills required.
Initial References:
- Yeshchenko, A., Di Ciccio, C., Mendling, J., & Polyvyanyy, A. (2021). Visual drift detection for event sequence data of business processes. IEEE Transactions on Visualization and Computer Graphics, 28(8), 3050-3068.
- De Smedt, J., Yeshchenko, A., Polyvyanyy, A., De Weerdt, J., & Mendling, J. (2023). Process model forecasting and change exploration using time series analysis of event sequence data. Data & Knowledge Engineering, 145, 102145.
- Yeshchenko, A., & Mendling, J. (2022). A survey of approaches for event sequence analysis and visualization using the ESeVis framework. arXiv preprint arXiv:2202.07941.
Supervisor: Jan Mendling
Topic 14: Perception of Privacy Guarantees (Master Wirtschaftsinformatik)
Anonymization of data is a common strategy to protect the privacy of individuals. This involves using formal notions of privacy to limit the privacy loss of an individual. While these guarantees are well-understood from a formal perspective, the perceived protection by users is only partially understood.
In this thesis project, a survey will be designed, performed, and analyzed to understand the perception of privacy guarantees. The student can expect close supervision and the opportunity to publish their results if they meet a certain quality threshold. However, it's important to note that this project will require a high level of dedication and knowledge of empirical research methods. It can be carried out as a master's thesis in Wirtschaftsinformatik.
Initial References:
- Wagner, I., & Eckhoff, D. (2018). Technical privacy metrics: a systematic survey. ACM Computing Surveys (CSUR), 51(3), 1-38.
- Krosnick, J. A. (2018). Questionnaire Design. In D. L. Vannette & J. A. Krosnick (Eds.), The Palgrave Handbook of Survey Research (pp. 439–455). Springer International Publishing. https://doi.org/10.1007/978-3-319-54395-6_53
Supervisors: Stephan Fahrenkrog-Petersen & Jennifer Haase