
Humboldt-Universität zu Berlin - Mathematisch-Naturwissenschaftliche Fakultät - Modellgetriebene Software Entwicklung

 

Model-Driven Software Engineering

Theses

Topics still available

 


Model Based Development

 

  • An Efficient Graph Matcher for the Henshin Model and Graph Transformation Framework
    Henshin is a model transformation framework based on graph transformation concepts, implemented on top of the Eclipse Modeling Framework. The computationally most expensive task performed by the Henshin transformation engine is finding all occurrences of a transformation rule's left-hand side graph in a larger host graph. The problem to be solved is a variant of the well-known subgraph isomorphism problem, namely finding all subgraphs of the host graph which are isomorphic to the left-hand side graph. The current implementation translates the problem into a constraint satisfaction problem amenable to an off-the-shelf constraint solver. While this approach is generic and works reasonably well for small to medium-sized graphs, it suffers from performance and scalability problems for larger host and/or rule graphs. The goal of this thesis is to implement a simple pattern matching algorithm and to try out heuristic methods, similar to those for the special case of triangle listing [Ortmann & Brandes 2014], for speeding up the search. Furthermore, experiments on finding partial patterns will be conducted to determine whether a suitable ordering of the pattern search leads to a speed-up. The thesis will be co-supervised by Prof. Kratsch.

    Supervisors: Prof. Dr. Timo Kehrer, Prof. Kratsch
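To illustrate the search problem (not Henshin's actual engine), the following is a minimal backtracking matcher over undirected graphs represented as plain adjacency dicts. The order in which candidate host nodes are tried is exactly where the heuristics studied in this thesis would come in.

```python
def find_occurrences(pattern, host):
    """Enumerate all occurrences of `pattern` in `host`.

    Both graphs are undirected and given as {node: set_of_neighbours}
    adjacency dicts. A match maps every pattern node to a distinct host
    node such that every pattern edge has a counterpart in the host.
    """
    p_nodes = list(pattern)
    matches = []

    def extend(partial):
        if len(partial) == len(p_nodes):
            matches.append(dict(partial))
            return
        p = p_nodes[len(partial)]
        for h in host:
            if h in partial.values():
                continue  # keep the match injective
            # every already-mapped neighbour of p must map to a neighbour of h
            if all(partial[q] in host[h] for q in pattern[p] if q in partial):
                partial[p] = h
                extend(partial)
                del partial[p]

    extend({})
    return matches


# Usage: all 3! = 6 embeddings of a triangle rule graph into a triangle host
pattern = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
host = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
print(len(find_occurrences(pattern, host)))
```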
     
  • Development of a Framework for the Example-Driven Specification of Complex Edit Operations
    Complex edit operations such as refactorings are an important configuration parameter for many tools supporting model evolution in the context of model-based software engineering (MBSE). Such tools typically require declarative, rule-based specifications of edit operations, which can be formulated in dedicated model transformation languages. For domain experts, writing a precise specification is non-trivial for several reasons, e.g. because model transformations operate on the abstract syntax of models, which can deviate considerably from their concrete syntax. "Model transformation by example" is a promising approach to overcoming this hurdle: domain experts demonstrate the transformation by giving examples in their familiar modeling environment, and the transformation specification is derived automatically from these examples. However, the specification is only as good as the examples provided. The goal of this thesis is to create a framework that involves modelers in the process of example-driven transformation specification through a feedback loop. The basic idea is to integrate the edit operations inferred from examples into standard editors so that they can be applied to further models. The results are then assessed by the modelers, producing new examples for the incremental refinement of the transformation rules.

    Supervisors: Prof. Dr. Timo Kehrer
     
  • Systematic Treatment of Variability in the Generation of Edit Operations for Models in Model-Based Software Engineering
    Edit operations on models are an important configuration parameter of many variable tools supporting model evolution in the context of model-based software engineering (MBSE). There is consensus in the literature that the set of available edit operations must be adapted to the model type; for example, different edit operations are useful for UML class diagrams than for Matlab/Simulink models. There is also consensus that the available edit operations can be roughly classified into complex and elementary operations. This thesis considers elementary edit operations. Several variation points already exist in the specification of such edit operations, but they have not yet been investigated in a systematic way. Such a systematic analysis is the subject of this thesis, with the goal of formally documenting the results in a variability model. The results shall furthermore be used to configure an existing generator tool for edit operations.

    Supervisors: Prof. Dr. Timo Kehrer
     
  • Matching Object-Oriented Data Models Using Ontology-Matching Techniques
    Tools for matching models are needed for a wide variety of purposes in all phases of model-based software development projects. Common model matching techniques usually presuppose a set of model properties that greatly ease the identification of corresponding model elements, e.g. the availability of unique identifiers for model elements or the stability of names across evolving models. However, these favorable conditions do not hold in early project phases, since models often emerge independently of one another, e.g. as different views of different stakeholders. Moreover, a project-wide uniform terminology for domain entities usually does not yet exist. The goal of this thesis is to examine to what extent ontology matching, i.e. techniques for determining corresponding concepts in ontologies, is suitable in this setting. The focus is on object-oriented analysis data models, since their description languages, e.g. UML class diagrams, exhibit a high similarity to ontology description languages.

    Supervisors: Prof. Dr. Timo Kehrer
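As a deliberately simple stand-in for ontology-matching techniques, the sketch below pairs elements of two data models purely by name similarity; all names and the threshold value are illustrative assumptions.

```python
from difflib import SequenceMatcher


def match_elements(model_a, model_b, threshold=0.7):
    """Match elements of two data models by name similarity.

    Each name in `model_a` is paired with the most similar name in
    `model_b`, provided the similarity ratio exceeds `threshold`.
    Real ontology matchers combine many such heuristics (structure,
    synonyms, instances); this is the simplest lexical one.
    """
    matches = {}
    for a in model_a:
        best, score = None, threshold
        for b in model_b:
            ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if ratio > score:
                best, score = b, ratio
        if best is not None:
            matches[a] = best
    return matches


# Usage: "Customer" pairs with "customers"; "Order" finds no counterpart
print(match_elements(["Customer", "Order"], ["customers", "Bill"]))
```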
     
  • Graph Grammar-Based Fuzzing for Testing Model-Based Software Engineering Tools
    Fuzzing is an established technique for testing programs that take structured inputs, with desktop publishing tools and web browsers as typical examples. The basic idea is to systematically generate input documents that cause unexpected program behavior, typically program crashes. Grammar-based fuzzing has been studied in the context of textual documents whose structure can be described by traditional context-free grammars: documents used as test inputs are generated with the help of the production rules specified by the grammar. This thesis shall investigate how this idea can be transferred from context-free grammars to graph grammars, which provide a constructive means for specifying the abstract syntax of visual models in model-based software engineering. The ultimate goal is to come up with an effective yet generic technique for fuzz testing model-based software engineering tools.

    Supervisors: Prof. Dr. Timo Kehrer
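The context-free starting point can be sketched in a few lines: a toy grammar (a made-up example, not from any particular tool) is expanded by randomly choosing production rules. The graph-grammar analogue this thesis targets would instead apply graph production rules to a host graph.

```python
import random

# Toy context-free grammar: each nonterminal maps to a list of
# alternatives; each alternative is a sequence of (non)terminals.
GRAMMAR = {
    "<expr>": [["<term>", "+", "<expr>"], ["<term>"]],
    "<term>": [["<num>"], ["(", "<expr>", ")"]],
    "<num>": [["0"], ["1"], ["2"]],
}


def fuzz(symbol="<expr>", depth=0, max_depth=8):
    """Generate a random test document by expanding grammar productions."""
    if symbol not in GRAMMAR:  # terminal: emit as-is
        return symbol
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:     # force termination via the shortest alternative
        alternatives = [min(alternatives, key=len)]
    return "".join(fuzz(s, depth + 1, max_depth)
                   for s in random.choice(alternatives))


print(fuzz())  # a random arithmetic expression such as (1+0)+2
```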

 

Software Version and Variant Management 

  • Edit Script Merging (Cooperation with Ulm University)
    Synchronising branches of forks is done by merging or rebasing the commit history with pull requests or cherry-picking. When adopting feature trace recording, we store the history of development not on the commit level but on the edit level (i.e., we record the edits developers make). Thus, we have access to the development history at a more fine-grained level in the form of an edit script (i.e., a list of edits).

    Goal: Design an edit-based merge operator that merges edit scripts instead of states or coarse-grained commit histories. Compare your operator against existing merge tools such as git merge.
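One conceivable starting point, under the assumption that edits are simple (location, operation, payload) tuples, is to treat edits on distinct locations as commuting and flag the rest as conflicts. A real operator would additionally reason about overlapping and dependent edits; this sketch only fixes the vocabulary.

```python
def merge_edit_scripts(left, right):
    """Merge two edit scripts recorded on the same base artefact.

    Each edit is a (location, operation, payload) tuple. Edits touching
    distinct locations are assumed to commute and are both kept; edits
    on the same location are reported as conflicts unless identical.
    """
    merged, conflicts = [], []
    right_by_loc = {}
    for edit in right:
        right_by_loc.setdefault(edit[0], []).append(edit)

    for edit in left:
        clashing = [e for e in right_by_loc.get(edit[0], []) if e != edit]
        if clashing:
            conflicts.append((edit, clashing))
        else:
            merged.append(edit)

    seen = set(merged)
    for edit in right:
        if edit not in seen and all(edit not in c for _, c in conflicts):
            merged.append(edit)
    return merged, conflicts


# Usage: conflicting replacements on the same line are reported, the rest merges
left = [("a.txt:1", "replace", "x")]
right = [("a.txt:1", "replace", "y"), ("b.txt:3", "insert", "z")]
print(merge_edit_scripts(left, right))
```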

    Questions:
    - To what extent can we merge edit scripts automatically?
    - What is the state of the art? Are there edit-based merge operators yet?
    - Does edit script merging increase a merge's accuracy?

    Supervisors: Alexander Schultheiß, Paul Maximilian Bittner (Ulm University), Prof. Dr. Timo Kehrer
     

  • Semantic Lifting of Abstract Syntax Tree Edits (Cooperation with Ulm University)
    Many tasks require assessing the changes made to development artefacts. A common way to detect changes is to use diffing operators that take the old and new state of an artefact as input and yield an edit script (i.e., a list of edits that transform the old state into the new state). For example, git diff computes a list of source code lines that were inserted and a list of lines that were removed.

    When computing such an edit script, the edits do constitute a valid transformation, but they might not reflect the actual changes made by the developers. In this sense, the edits might diverge from what developers actually did. This becomes more prominent when looking at diffs for abstract syntax trees (ASTs), where edit scripts consist of possibly very fine-grained tree operations.

    To this end, we introduced the notion of semantic edits: edit operations that describe developers' intents more accurately when editing ASTs.
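As a minimal sketch of such lifting (the edit representation and the 'move' rule are hypothetical, not the project's actual catalogue), a delete and an insert of the same node at different positions can be folded into a single coarser operation:

```python
def lift_edit_script(edits):
    """Lift a fine-grained tree edit script to coarser semantic edits.

    Edits are (op, node_id, position) tuples. A delete and an insert of
    the same node at different positions are folded into one 'move';
    real semantic-edit catalogues would also cover renames, signature
    changes, and similar developer-level operations.
    """
    deletes = {e[1] for e in edits if e[0] == "delete"}
    lifted, consumed = [], set()
    for op, node, pos in edits:
        if op == "insert" and node in deletes:
            lifted.append(("move", node, pos))  # delete+insert => move
            consumed.add(node)
    for op, node, pos in edits:
        if node not in consumed:
            lifted.append((op, node, pos))
    return lifted


# Usage: moving method m1 from A.foo to B.foo appears as delete+insert
edits = [("delete", "m1", "A.foo"), ("insert", "m1", "B.foo"),
         ("insert", "m2", "A.bar")]
print(lift_edit_script(edits))
```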

    Questions:
    - Can we lift edit scripts to semantic-edit scripts?
    - Do semantic edits represent developers' actions more accurately?
    - Do existing tree diffing algorithms yield edits impossible for programmers to execute on the concrete syntax?
    - Are semantic edits sufficient to assess developers' intents?
    - Does semantic lifting increase diffing accuracy?

    Goal: Create an algorithm for semantic lifting of edit scripts obtained from off-the-shelf tree differs to semantic edits. Optionally, extend the notion of semantic edits for increased accuracy. Evaluate your results on existing diffs.

    Supervisors: Alexander Schultheiß, Paul Maximilian Bittner (Ulm University), Prof. Dr. Timo Kehrer
     

  • History-Based Code Matching 
    Various tools that automate aspects of software development require a matching of the code belonging to different versions or variants of a software system. One simple use case is merging two different branches in a version control system. Git, for example, applies changes based on a naive name-based and line-based matching. However, this can lead to merge conflicts if both versions were edited in parallel. Some of these conflicts might be avoided if more accurate matchings were available.

    The core idea of this topic is to calculate a matching between two different versions based on the development history of the software. Usually, both versions have a common ancestor in the history: the commit from which the branch was created. This common ancestor and the history of changes that were applied over time can potentially be used to improve the accuracy of matches between different code parts.
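A rough sketch of this composition idea, using difflib line matching as a stand-in for a real per-commit matcher: each branch's per-step matchings are composed from the common ancestor to the branch tip, and the two tip matchings are then joined over the ancestor's lines.

```python
import difflib


def match_through_history(ancestor, history_a, history_b):
    """Match lines of two versions via their common ancestor.

    `history_a`/`history_b` are the successive states of each branch
    after the common ancestor. The ancestor-to-tip matching of each
    branch is obtained by composing the line matchings of every step;
    the two tips are then matched by joining over the ancestor.
    """
    def compose(states):
        mapping = {i: i for i in range(len(ancestor))}  # identity on ancestor
        prev = ancestor
        for state in states:
            blocks = difflib.SequenceMatcher(a=prev, b=state).get_matching_blocks()
            step = {}
            for b in blocks:
                for k in range(b.size):
                    step[b.a + k] = b.b + k
            # drop lines that vanished in this step, re-target the rest
            mapping = {src: step[dst] for src, dst in mapping.items() if dst in step}
            prev = state
        return mapping

    map_a, map_b = compose(history_a), compose(history_b)
    # line i of tip A matches line j of tip B iff both descend from one ancestor line
    return {map_a[i]: map_b[i] for i in map_a if i in map_b}


# Usage: one branch prepends a line, the other appends one
print(match_through_history(["a", "b", "c"],
                            [["x", "a", "b", "c"]],
                            [["a", "b", "c", "y"]]))
```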

    Goal: Implement a prototype that calculates a matching for two versions of a software system based on the software's development history. Evaluate the approach and compare it to similar approaches, such as Git's naive approach.

    Questions:
    - Can we improve the accuracy of a code matching between different versions by considering the development history, in comparison to similar approaches?
    - Can we reduce the number of merge conflicts by calculating better matches?

    Supervisors: Alexander Schultheiß, Prof. Dr. Timo Kehrer
     

  • Investigating the Effect of Variant Drift 
    Software is often released in multiple variants to address the needs of different customers or application scenarios. One frequent approach to create new variants is clone-and-own (copy-paste), whose systematic support has gained considerable research interest in the last decade. However, only few techniques have been evaluated in a realistic setting, due to a substantial lack of publicly available clone-and-own projects which could be used as experimental subjects.

    Many studies use variants generated from software product lines for their evaluation. Unfortunately, the results might be biased, because variants generated from a single code base lack the unintentional divergences that would have been introduced by clone-and-own. The performance of state-of-the-art algorithms should therefore be assessed on variant sets exposing increasing degrees of divergence.

    Goal: Implement a prototype that introduces increasing degrees of divergence in a set of software variants (e.g., by applying refactoring operations). Evaluate the effect that variant drift has on the evaluation of tools supporting clone-and-own. 
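One simplistic way to simulate such drift (a sketch only; the real prototype would apply proper refactoring operations) is to rename a configurable fraction of identifiers in a cloned variant:

```python
import random
import re


def introduce_drift(variant, identifiers, degree, seed=0):
    """Simulate variant drift by renaming a fraction of identifiers.

    `variant` is source code as a string, `identifiers` the names that
    may be renamed, and `degree` in [0, 1] the fraction to rename.
    Renaming is a stand-in for the refactoring operations mentioned
    above; the `_v2` suffix is an arbitrary choice.
    """
    rng = random.Random(seed)
    n = round(degree * len(identifiers))
    for name in rng.sample(identifiers, n):
        # \b keeps e.g. "foobar" intact when renaming "foo"
        variant = re.sub(rf"\b{re.escape(name)}\b", name + "_v2", variant)
    return variant


# Usage: full drift renames every listed identifier in the clone
print(introduce_drift("def foo():\n    return bar(foo)", ["foo", "bar"], 1.0))
```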

    Questions:
    - Do techniques supporting clone-and-own perform better on product-line variants than on clone-and-own variants?
    - Do more sophisticated techniques generally yield better results than simple ones?
    - Does increasing variant drift reveal differences in the quality of results delivered by techniques supporting clone-and-own?

    Supervisors: Alexander Schultheiß, Prof. Dr. Timo Kehrer

 

Feature Trace Recording
(In Cooperation with Ulm University)

The following topics are in cooperation with Ulm University. Please contact Alexander Schultheiß for more information. 

Despite extensive research on software product lines, ad-hoc branching and forking, known as clone-and-own (copy-and-paste), is still the dominant way of developing multi-variant software systems. Retroactive product-line migration techniques suffer from uncertainties and high effort when recovering lost domain knowledge for tracing features to their implementation. We investigate a methodology for gradually incorporating domain knowledge during software development by recording feature traces upon source code edits.

One could say that the goal of feature trace recording is to offer systematic support for managing projects that heavily rely on copy-paste. Without systematic support, these projects would eventually become unmanageable.

 

A feature is a characteristic of a program that can either be activated or deactivated (for example via plug-ins, preprocessor macros (#ifdef), or services).

A feature trace states to which feature an artefact (e.g., a file, a class, or a method) belongs. For example, a method of a class may be associated with the feature GUI_Support and should only be present if users should be able to interact with the program via a GUI. If the feature GUI_Support is disabled, the method is omitted from the compiled program.

For feature trace recording, developers specify the feature they are working on as a propositional formula (the feature context) during development. From the kind of edits developers make (i.e., how they program) and the feature context, we derive feature traces for implementation artefacts such as source code. In this way, more and more feature traces are recorded over the course of development, improving the maintainability of the program over time.
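The mechanics can be sketched as follows. This is a deliberately naive illustration, not the actual recording methodology: formulas are kept as plain strings, and the policy of conjoining contexts when an artefact is edited repeatedly is just one simplistic choice (real approaches reason about the formulas logically).

```python
class FeatureTraceRecorder:
    """Minimal sketch of recording feature traces during editing."""

    def __init__(self):
        self.context = "True"  # feature context: a propositional formula
        self.traces = {}       # artefact -> recorded feature trace

    def set_context(self, formula):
        """Developer declares which feature they are working on."""
        self.context = formula

    def record_edit(self, artefact):
        """Annotate the edited artefact with the current feature context."""
        previous = self.traces.get(artefact)
        if previous is None or previous == self.context:
            self.traces[artefact] = self.context
        else:
            # edited under several contexts: keep the conjunction
            self.traces[artefact] = f"({previous}) and ({self.context})"


# Usage: edits made under the GUI_Support context tag the touched artefact
recorder = FeatureTraceRecorder()
recorder.set_context("GUI_Support")
recorder.record_edit("Dialog.paint")
print(recorder.traces["Dialog.paint"])
```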

There are various open thesis topics related to feature trace recording. The following list includes examples of available topics. More are available on request.  

 

 

Scientific Software Engineering, Data-oriented Software Engineering

 

  • Domain Specific Language for Transfer Learning / Uncertainty Quantification
    Recent breakthroughs in biomedical image analysis have been based on the application of advanced machine learning to large imaging data sets. For instance, convolutional neural network-based classifiers have recently been shown to reach or even exceed human-level performance in a range of disease- and abnormality-detection tasks, based on tens of thousands of images. However, such machine learning approaches often fail when confronted with changes to the input data, e.g. images from a different MRI machine, or patients of a different age group or ethnicity. Uncertainty quantification and transfer learning are used to identify and alleviate such problems.
    In this project, we are looking for a student to design a domain-specific language (DSL) for uncertainty quantification and transfer learning on biomedical images, based on our existing code base in Python. Specifically, the task is to identify commonalities between neuroimaging and microscopy data analysis, and to design and build a framework for evaluating state-of-the-art methods for uncertainty quantification and transfer learning in both analysis settings.

    Supervisors: Prof. Dr. Timo Kehrer, Prof. Dr. Kerstin Ritter