1. Introduction
Traditional science often follows the path of formulating a problem statement, setting up a scientific experiment as a verification environment, and running the experiment to gather data, which are then analyzed to either verify or falsify the underlying assumptions. Nowadays, however, setting up real-world environments for the experimental evaluation of problem statements is often too costly and time-consuming. The experimental environment is therefore frequently replaced by digital solutions in which software models are run in simulators and other scientific software tools, substituting the real-world experimental workflow for gathering data with a digital workflow performed in silico (in the following called “computational scientific workflow”). The research field of computational science tackles such challenges by creating the necessary tools and research methodologies, offering software solutions that support the digital workflows of scientists in any research field. Aspects relevant to the respective research are formalized and transformed into computer models, which provide efficient means to analyze large sets of data-intensive scenarios and to utilize advanced computing capabilities. Across disciplines, the model landscape comprises a wide variety of model types, each focusing on certain facets of a system with individual levels of detail and resolution and applying different methodologies for simulation, optimization, and statistical data processing. To gain a broader view of a system, it is typically necessary to use multiple models, simulators, and other auxiliary tools (e.g., optimization tools) and to combine their input and output data flows, forming a more complex computational scientific workflow for the evaluation of key system properties. Such a workflow can be formally described by defining the connections between model output and input streams and the coordinated execution of the model logic, implementing sequential, iterative, or simultaneous execution. If more than one simulator is involved, more complex coordination mechanisms, e.g., time-step-based or event-based synchronization of simulator execution, may be needed to model co-simulation workflows. To achieve a higher degree of workflow automation, the model tasks within an automated workflow can be complemented by other kinds of auxiliary processing tasks, e.g., for data transformation, validation, or visualization.
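To make this notion of coordinated execution more concrete, the following minimal Python sketch couples two toy simulator stubs through a simple time-step-based loop, the most basic form of the co-simulation coordination mentioned above. The sketch is purely illustrative and not part of any framework discussed here; the simulator functions, exchanged signals, and time stepping are hypothetical.

# Illustrative sketch: two hypothetical simulator stubs coupled by a
# time-step-based coordination loop, i.e., the kind of execution logic a
# computational scientific workflow has to formalize.

def grid_simulator_step(t, load_mw):
    """Hypothetical grid model: returns a bus voltage for the current load."""
    return 1.0 - 0.05 * load_mw

def storage_simulator_step(t, voltage):
    """Hypothetical storage model: reacts to the voltage of the previous step."""
    return max(0.0, 0.8 - voltage)  # charging power

def run_cosimulation(n_steps=24, dt=1.0):
    voltage, results = 1.0, []
    for step in range(n_steps):
        t = step * dt
        load = 0.6 + 0.3 * (step % 2)                 # toy input data stream
        charging = storage_simulator_step(t, voltage)  # output of one model ...
        voltage = grid_simulator_step(t, load + charging)  # ... feeds the other
        results.append({"t": t, "voltage": voltage, "charging": charging})
    return results

if __name__ == "__main__":
    for row in run_cosimulation(4):
        print(row)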
Figure 1 depicts the role of software-instrumented workflows within the context of computational science, which provide a digital equivalent to the traditional experiment setup. Most often, scientific research begins with the formulation of a research question that can be stated more or less concretely. The benefits of researching this topic and the impact of potential findings are described. After the question has been stated, a thorough investigation, e.g., based on literature research of the thematic background, is typically conducted to identify the most advanced models, procedures, and processes that might help answer the question. These models, procedures, and processes must then be compiled into the digital equivalent of the classical experiment setup so that the digital experiments can be performed by executing the digital workflow. Depending on the availability of the prerequisites found by this investigation, the subsequent research approach is determined. If the identified state-of-the-art models and methods are already implemented in a form that fits the investigator’s software infrastructure, they can be integrated and used directly. If this is not possible, new model implementations or dedicated software, e.g., implementing a certain control algorithm, become necessary. After all models and auxiliary software tools have been implemented, the foreseen digital experiment workflow can be (manually) executed by inputting adequate test data and gathering result data until all respective scenarios have been sufficiently investigated. The final model results, which are typically already post-processed and visualized for human readability at the end of a workflow, are then put into the context of the research topic and the results of previous scientific work. The outcomes are carefully interpreted and conclusions are drawn within the respective scientific community. In the best case, a satisfying answer to the originally stated research question can be found. Typically, research findings lead to new research questions requiring extensions and adjustments to the previous workflow, which mark the beginning of the next iteration of the workflow development cycle.
For implementing the coordination and interaction between workflow tasks, several approaches can be employed. In a tight coupling approach, concrete task implementations directly refer to each other in their source code. This approach is constrained to the use of suitable programming frameworks for implementing the interprocess or distributed communication logic and requires intimate knowledge of the individual models’ internal procedures (white box). The immediate dependencies between model functionalities lead to a low level of modularity, hindering the reuse of model code in other contexts as well as internal refactoring. The replacement of individual model components or versions involves high reprogramming, testing, and documentation efforts. Loose coupling, on the other hand, aims for an abstraction of model components where each component implements a more generic extrinsic interface, e.g., defining input and output functionality, allowing flexible configuration for adapting the module to different application contexts and encapsulating the internal model logic (black box). Utilizing this modular approach to implement a workflow, modules can be used in different application contexts, and only knowledge about their respective interfaces is needed. Replacing a component within a workflow by another one implementing a compatible extrinsic interface does not require code changes in other tasks but only adjustments to the runtime configuration responsible for executing the task component. The more the respective tasks use common interface standards, e.g., for describing the exchange of data or configuration information, the fewer adjustments are needed to integrate these components into a workflow. Although a loose coupling approach allows a higher degree of freedom in combining programming languages and environments, the design of a stable extrinsic interface providing compatibility with standard exchange interfaces of other related software tools has to be explicitly considered during the construction of a workflow.
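The following minimal Python sketch illustrates the loose coupling idea under simplified assumptions; the interface, component names, and data keys are hypothetical and not taken from any specific framework. Every component implements the same extrinsic run interface, so a component can be replaced by a compatible one without changing the code of other tasks.

# Illustrative sketch: a generic extrinsic task interface for loose coupling.
# Components are treated as black boxes mapping named inputs to named outputs
# and can be exchanged without touching the code of other tasks.
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class WorkflowTask(ABC):
    """Hypothetical extrinsic interface shared by all workflow components."""

    @abstractmethod
    def run(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        ...

class DetailedGridModel(WorkflowTask):       # hypothetical component
    def run(self, inputs):
        return {"losses_mw": 0.02 * inputs["load_mw"]}

class SimplifiedGridModel(WorkflowTask):     # drop-in replacement, same interface
    def run(self, inputs):
        return {"losses_mw": 0.03 * inputs["load_mw"]}

def execute_chain(tasks: List[WorkflowTask], data: Dict[str, Any]) -> Dict[str, Any]:
    """Sequentially pipes each task's outputs into the next task's inputs."""
    for task in tasks:
        data = {**data, **task.run(data)}
    return data

# Swapping DetailedGridModel() for SimplifiedGridModel() is a configuration
# change only; no other task implementation has to be modified.
print(execute_chain([SimplifiedGridModel()], {"load_mw": 100.0}))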
In the context of researching and evaluating the behavior of complex system setups, the manual execution of digital scientific workflows—where many manual steps are performed using different software tools—becomes more and more tedious. Often, automated coordination of the execution of one or more software tools is needed, e.g., to perform a co-simulation. This form of coordination is either programmed by the scientists themselves as an auxiliary tool, using one of the strategies introduced previously, or a certain framework (e.g., a co-simulation framework) is used, which performs the coordination but also requires writing glue code. All this contributes to a large overhead that scientists are confronted with when setting up the digital equivalent of an experimental setup and that hinders them from concentrating on performing the actual experiments and evaluating the results. Thus, a central research question nowadays is whether generic computational workflow execution platforms can be created that fully automate the execution of complex computational workflows and free scientists from manually implementing different kinds of auxiliary tools, e.g., for coordination or data transformation.
Considering the fast-growing complexity of modern technical systems and the need to efficiently perform many interconnected software-based scientific tasks for their evaluation, an organization’s development of auxiliary tools and its setup of a software environment that facilitates easy-to-use workflows is a strategic issue with significant impact on research efficiency and with long-term consequences. Against the background of the software used by the respective scientific community, the choice of programming languages and environments, operating systems, optimization solvers and other supporting tools and libraries, interface standards, file formats, etc. directly influences the potential to couple the organization’s model set with those of fellow researchers. Since the investigation of complex and interdisciplinary research questions is usually not handled by a single organization but is made possible through collaborations within the community, this potential is of particular importance. At the same time, changing an organization’s software strategy is not an easy task. Models and other executables are already implemented, making redevelopments in other programming languages and environments seldom worthwhile. Additionally, retraining employees and rebuilding programming experience constitute cumbersome exercises. This makes coupling environments that allow the integration of existing tools and models with as little effort as possible very valuable.
In addition, while sufficient flexibility in model coupling has to be ensured, the efficient processing of increasingly complex workflows—which is indispensable for achieving meaningful and reliable results—presents a growing performance challenge. In contrast to white box integrated applications, which are often internally designed for parallel computing by using a parallel computing framework for their implementation, loosely coupled workflows are usually constructed by chaining existing model components together without efficient parallel execution of the components in mind. It follows that the logic to run subprocesses and entire workflows in parallel, as well as the communication control and consistency assurance mechanisms, have to be designed, implemented, and tested with each newly drafted workflow. The workflow engineering and runtime environment should make the instrumentation of such features, as essential parts of a scientific computing workflow, as easy as possible.
Within this overall context of efficiently performing computational science tasks, the provision of such a scientific workflow environment as essential software infrastructure is an open research question, especially for energy research projects that bundle the smart energy systems related research of several research centers. One key element of such a research environment is a software platform that easily allows the research centers not only to share data but especially to integrate their respective software solutions for performing digital workflows (e.g., models of new storage solutions, technical energy plants, optimization and control algorithms) into bigger, fully automated workflows that implement integrated energy system co-simulation models or other complex computational scientific workflows. The requirements that have to be met by such a platform can be summarized as follows:
automated setup and execution of computational scientific workflows and co-simulations
reusability of integrated executables with their specific dependencies and runtime environments
configurable communication between executables without the need for changing executables or specific implementations of interfaces and adapters by the users
availability of an easy-to-use web interface to build, operate and manage scenarios
parallelization and coordination of workflows for increasing performance
The remainder of this article is organized as follows:
Section 2 presents the main results of a literature review on existing solutions for efficiently automating complex computational scientific workflows. Since no solution (either commercial or open source) was found that fulfills the discussed requirements, further research on developing a flexible framework for automating computational scientific workflows within the energy research software environment was conducted. Within this context, a background literature review was performed to identify state-of-the-art methodological approaches and reusable frameworks that could serve as building blocks for a comprehensive solution. The selected technologies and basic approaches used are described in Section 2.2. In Section 3, the architectural design of the workflow automation framework is described. The architecture addresses the goals of a generic, modular, and highly scalable framework supporting the loose coupling of executables. Additionally, by instrumenting cluster computing environments and modern runtime automation technologies, it is capable of efficiently performing complex data processing and co-simulation workflows with high levels of performance and automation. Furthermore, a web-based editor and runtime interface is presented, providing an easy-to-use user interface for defining and managing scientific workflows. In Section 4, for the purpose of testing and evaluating the presented framework, a typical scientific model chain from the field of Energy Systems Analysis is used as an example workflow and implemented within the framework. The setup of this example workflow and the overall procedure for evaluating and benchmarking the framework are explained. The implications of the chosen architectural design and the benchmark results are then discussed in Section 5. Section 6 summarizes the key results of the article and gives an outlook on future work.
5. Discussion
In addition to the discussion of the observed runtime behavior of the presented framework, this section also considers the architectural design decisions, as they have a significant impact on the software-related processes within the workflow development cycle of computational science disciplines such as Energy Systems Analysis (compare Figure 1). A major aspect in this regard is that the container virtualization employed by the framework allows the integration of all kinds of models and other executables, regardless of the operating system or the programming languages used. As long as all dependencies needed for execution are provided, Docker files and images of any application or service can be built and embedded into a workflow. When using the framework, it is therefore no longer necessary that already existing models—while meeting the corresponding scientific requirements—have to be redeveloped only because they cannot be integrated into a research organization’s software infrastructure. As a result, a considerable amount of development and implementation effort can be saved with each complex model, which can additionally be treated as a black box and does not have to be reverse engineered. However, if no appropriate model is available, a new model still has to be developed from scratch. But also in this case the chosen architecture is an asset, indirectly allowing a high degree of freedom in choosing the technology stack for the model implementation, since all programming languages, technology environments, and resources can be used within the software implementation process.
Besides the possibility to integrate the most diverse executables, the effort involved has to be discussed in order to describe the usefulness of the framework for supporting computational research processes. As explained in Section 3.1 and Section 3.4, a Docker file has to be written to make an executable usable within the framework. As exemplarily shown in Figure 11, this includes listing all program libraries and other software components the executable depends on. Then the source code file of the executable is named and an individual set of adapters, which come with the framework, is selected. The communication and management setup is the same for all executables and does not have to be modified by the user. Afterwards, with a single command line instruction, a corresponding Docker image is built, which can then easily be integrated into a workflow via the Apache NiFi user interface. This short setup process can be completed in minutes, even by users with little programming experience and regardless of whether the executable is an in-house development. Requiring interface knowledge only, the framework performs significantly better in model integration than a tight coupling approach, which always involves sophisticated and time-consuming changes to the code base. A loose coupling approach—which is typically based either on an individually programmed workflow coupling script or on some kind of workflow or co-simulation framework, and which is far more inflexible in terms of integrating different technologies—requires at least the same amount of integration effort as the presented framework. Furthermore, the framework offers the advantage that the integration effort has to be spent only once per model, not once per workflow as in the tight and loose coupling approaches of other frameworks, since the created Docker images and the NiFi processors are stored in a Docker registry and a NiFi processor library, respectively, and can be reused in all future iterations of the workflow development cycle, which is particularly beneficial in the dynamic field of computational research.
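As an illustration of how little scripting this integration step involves, the following sketch uses the Docker SDK for Python to build an image and push it to a registry. This is an assumption made purely for illustration: the framework itself only requires the single docker build command described above, and the build context path, image name, and registry URL used here are hypothetical.

# Illustrative sketch (requires a running Docker daemon and "pip install docker"):
# building an image for an executable and registering it for reuse in workflows.
import docker

client = docker.from_env()

# Build the image from the directory containing the executable's Dockerfile.
image, build_logs = client.images.build(
    path="./my_model",                                   # hypothetical build context
    tag="registry.example.org/workflows/my_model:1.0",   # hypothetical registry/tag
)
for entry in build_logs:
    if "stream" in entry:
        print(entry["stream"], end="")

# Push the image to the registry so it can be reused in future workflow iterations.
for line in client.images.push(
    "registry.example.org/workflows/my_model", tag="1.0", stream=True, decode=True
):
    print(line)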
Another major characteristic is that, unlike the other coupling approaches, the presented framework does not require any additional programming effort to automatically execute multiple instances of a workflow or parts of a workflow in parallel, since this coordination functionality is already inherent to the NiFi processor logic described in Section 3.6. The benchmark results presented in Section 4.1 show that the induced memory overhead is almost negligible and that the slight increase in runtime is easily compensated by utilizing the parallel computation capabilities. At the same time, as described in Section 3.6 and Section 4.2, the architecture includes generic functionality to synchronize the results of parallel calculations, which allows workflows to be supplemented with components that aggregate, consolidate, and prepare result data for evaluation and interpretation, tasks necessary at the end of every scientific experiment. The combination of these two features ensures that the logic to run subprocesses and entire workflows in parallel, as well as the communication control and consistency assurance mechanisms, does not have to be designed, implemented, and tested with each newly drafted workflow.
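To illustrate the kind of fan-out/fan-in logic that the framework thereby takes off the users’ hands, the following Python sketch shows a hand-written version of it under simplified assumptions; the scenario task and the aggregation step are hypothetical placeholders, and within the framework this coordination is performed by the NiFi processor logic rather than by user code.

# Illustrative sketch of fan-out/fan-in: multiple workflow instances run in
# parallel and their results are synchronized and aggregated afterwards.
from concurrent.futures import ProcessPoolExecutor
from statistics import mean

def run_scenario(scenario_id: int) -> dict:
    """Stand-in for one parallel workflow instance (e.g., one optimization run)."""
    objective = sum(i * scenario_id for i in range(100_000))  # placeholder computation
    return {"scenario": scenario_id, "objective": objective}

def aggregate(results: list) -> dict:
    """Fan-in step: consolidate the results of all parallel instances."""
    return {
        "n_scenarios": len(results),
        "mean_objective": mean(r["objective"] for r in results),
        "best_scenario": min(results, key=lambda r: r["objective"])["scenario"],
    }

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:          # fan-out over available CPU cores
        results = list(pool.map(run_scenario, range(1, 9)))
    print(aggregate(results))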
It is the authors’ opinion that all the discussed potentials for saving time and effort in the technical coupling of executables into workflows, together with the decrease in required software development skills, easily compensate for the effort that has to be made to install and learn the framework, and that the presented approach can significantly help streamline the workflow development processes of an institution performing research within the domain of computational science. Once the framework is set up in an institution and the models and auxiliary tasks are instrumented as NiFi processors within the workflow platform, all scientists of the organization can directly benefit from the reuse of components, which are immediately available within the platform.
6. Conclusions
The efficient automation and parallel computation of complex workflows are of increasing importance for performing computational science. When coupling heterogeneous models and other executables, a wide variety of software infrastructure requirements must be considered to ensure the compatibility of workflow components. The consistent utilization of advanced computing capabilities and the implementation of sustainable software development concepts that guarantee maximum efficiency and reusability are further challenges that scientists within research organizations regularly face. This article addresses these challenges by presenting a generic, modular, and highly scalable process operation framework for the efficient coupling and automated execution of complex computational scientific workflows. By implementing a microservice architecture that utilizes Docker container virtualization and Kubernetes container orchestration, the framework supports the flexible and efficient parallelization of computational tasks on distributed cluster nodes. Additionally, the use of Redis and different I/O adapters offers a scalable and high-performance communication infrastructure for data exchange between executables, allowing the computation of workflows without requiring the adjustment of executables or the implementation of interfaces or adapters by the users. The specification, processing, controlling, and evaluation of computational scientific workflows is ensured by a convenient and easily understandable user interface based on Apache NiFi technology. By implementing and executing a complex Energy Systems Analysis workflow, the performance of executing workflows with the presented framework was evaluated. The memory footprint for running an executable using the framework is similar to that of a manual execution. Moreover, the performance and running environment for single model instances within the framework are also nearly equivalent. Due to the high scalability and extended flexibility of the framework, use cases benefitting from parallel execution can be parallelized, thereby significantly saving runtime and improving operational efficiency, especially for complex tasks like iterative grid optimization.
In order to consolidate the results presented in this article and to further verify the usefulness of the framework, the next step is to broaden the user base and gain experience with additional use cases. As part of the further development of the presented framework, the aim is to provide users with the opportunity to graphically depict workflow information and to enhance data management and examination capabilities. Furthermore, a more comprehensive user interface will be implemented to allow the upload and automated integration of customized executables into the framework.