Humboldt-Universität zu Berlin - Mathematisch-Naturwissenschaftliche Fakultät - Institut für Informatik

Promotionsvortrag Fabian Lehmann

"Adaptive Scheduling of Dynamic Workflows"

Participation by Zoom is possible; please contact Prof. Leser for a link.

 

================================================

Scientific workflows are the state-of-the-art approach for analyzing large datasets across a wide range of domains, including bioinformatics, remote sensing, astronomy, physics, and many others. In recent years, significant effort has been devoted to the portability and reusability of workflows, enabling a workflow developed by one research group to be reused on another group's cluster or with another dataset with little or no adaptation. For this purpose, Scientific Workflow Management Systems (SWMSs) have become agnostic to the infrastructure, enabling similar execution on HPC systems, in the cloud, or on in-house clusters. As a result, SWMSs often lack mechanisms for users to optimize a workflow's execution for their specific dataset and cluster. The absence of fine-grained optimization typically leads researchers to overprovision resources, which in turn produces suboptimal scheduling plans, resulting in unnecessarily long makespans and longer waits for results.

We identify four potential bottlenecks that increase a workflow's makespan, arising from limitations in state-of-the-art technology for reusing workflows on different clusters or datasets.
a) Scheduling decisions are based on incomplete information, as relevant metadata is not properly transferred to the compute infrastructure. This lack of context results in suboptimal workflow executions.
b) Network performance varies across clusters, causing workflows to experience I/O bottlenecks on some, but not all, systems.
c) Users tend to overestimate memory requests per task, as the requested value must also hold for other datasets and a task's actual memory usage often fluctuates between runs.
d) As with memory, CPU requests are often overestimated, since users assume that all threads of a task run at 100% utilization for the task's entire duration (see the illustration after this list).
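
As a hypothetical illustration of bottlenecks (c) and (d): over-requesting a resource directly limits how many tasks a node can run concurrently. The numbers below are invented for illustration and are not measurements from the thesis.

    # Hypothetical example: effect of an over-requested memory allocation
    # on node packing. All numbers are illustrative only.
    node_memory_gb = 64
    requested_gb = 16   # conservative, dataset-independent user request
    actual_peak_gb = 4  # what the task really needs on this dataset

    tasks_scheduled = node_memory_gb // requested_gb   # 4 tasks per node
    tasks_possible = node_memory_gb // actual_peak_gb  # 16 tasks per node

    print(f"scheduled: {tasks_scheduled}, possible: {tasks_possible}")
    # The node runs at a quarter of its achievable parallelism, and the
    # workflow's makespan grows accordingly.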

In this thesis, we develop an interface to exchange workflow information between an SWMS and a Resource Manager (RM). This integration enables more informed scheduling decisions and adaptive resource allocation. Through this interface, RMs obtain sufficient information to tailor workflow execution automatically to the characteristics of their specific infrastructure.
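
The abstract does not specify the concrete format of this interface. The following is a minimal sketch of the kind of information such an interface could carry from the SWMS to the RM; all field and type names are hypothetical, and the actual interface of the thesis may differ.

    # Sketch of per-task metadata an SWMS could pass to a resource manager.
    from dataclasses import dataclass, field

    @dataclass
    class TaskInfo:
        task_id: str
        depends_on: list[str]        # predecessor task IDs (DAG edges)
        input_files: dict[str, int]  # input path -> size in bytes
        requested_cpus: float
        requested_memory_mb: int

    @dataclass
    class WorkflowInfo:
        workflow_id: str
        tasks: list[TaskInfo] = field(default_factory=list)

    # With the DAG structure, input sizes, and resource requests visible,
    # the RM can co-locate tasks with their data and adapt allocations
    # instead of scheduling each task in isolation.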

Building on this foundation, we develop two adaptive workflow optimization strategies. First, we present a new scheduling approach that considers the workflow's structure and the data transfers between tasks, and leverages speculative copies to prepare tasks on multiple nodes; this resolves scheduling and I/O bottlenecks and thus addresses bottlenecks (a) and (b). Second, we introduce a new approach that sizes tasks' memory requests online, aiming to improve throughput in memory-limited clusters,
addressing bottleneck (c). Finally, we combine these strategies and extend them with a CPU-sizing approach to demonstrate their combined benefits and overall potential, addressing bottleneck (d).
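
To make the online memory sizing concrete: a common pattern, and only a plausible reading of the approach rather than the exact predictor of the thesis, is to derive the next request from peak usage observed so far plus a safety margin, and to fall back to a larger request when a task is killed for exceeding its allocation. The sketch below is illustrative; all names and parameters are assumptions.

    import statistics

    def next_memory_request_mb(observed_peaks_mb, default_mb=8192,
                               safety_factor=1.2):
        """Size the next request from peak usage observed so far.

        Illustrative heuristic only: mean observed peak plus one
        standard deviation, scaled by a safety margin.
        """
        if not observed_peaks_mb:
            return default_mb  # no history yet: keep the user's request
        mean = statistics.mean(observed_peaks_mb)
        spread = statistics.pstdev(observed_peaks_mb)
        return int((mean + spread) * safety_factor)

    def request_after_failure(failed_request_mb):
        # A task killed for exceeding its allocation is retried with a
        # doubled request, trading one wasted attempt for progress.
        return failed_request_mb * 2
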
Results obtained with our prototype, built on Nextflow and evaluated with real state-of-the-art workflows and datasets, indicate that our approaches reduce the makespan substantially on state-of-the-art clusters.