A Comprehensive Review of Evolutionary Algorithms for Multiprocessor DAG Scheduling

: The multiprocessor task scheduling problem has received considerable attention over the last three decades. In this context, a wide range of studies focuses on the design of evolutionary algorithms. These papers deal with many topics, such as task characteristics, environmental heterogeneity, and optimization criteria. To classify the academic production in this research ﬁeld, we present here a systematic literature review for the directed acyclic graph (DAG) scheduling, that is, when tasks are modeled through a directed acyclic graph. Based on the survey of 56 works, we provide a panorama about the last 30 years of research in this ﬁeld. From the analyzes of the selected studies, we found a diversity of application domains and mapped their main contributions.

An multiprocessor task scheduling problem (MTSP) instance, adapted from Reference [3], with eight tasks and two processors. In this figure, cc represents the communication cost; for instance, t 6 waits two seconds to start its execution because task t 3 is sending a message throw network.
Similar to many optimization problems in the real world, the search for the optimal MTSP solution has a high computational complexity [1,2] and cannot be solved promptly by brute force techniques. One of the most investigated approaches to this type of problem is the use of techniques that, in many cases, discover a set of potential solutions, from which one can find the global optimum or an approximate solution to that problem. In this context, evolutionary algorithms (EAs) have been investigated as a search method due to their flexibility to solve complex optimization problems.
EAs [4] can be adapted to various types of problems. Inspired by the evolutionary paradigm, these techniques work with the evolution of several solutions (population) simultaneously, ensuring a significant advantage when compared to other techniques. There are several studies involving EAs applied to the scheduling problem, and this number increases every year. For this reason, it is necessary to systematically organize these works in order to summarize their main characteristics, understand contributions, authors, and institutions that maintain relevant publications. With such information, it is possible to identify domains that may receive attention to additional research in the future.
One way to investigate a significant amount of studies, and synthesize their contributions, is through a Systematic Literature Review (SLR) [5,6]. SLR consists of a rigorous procedure to select and analyze works that fit a research question without following a bias on the part of the researchers. Moreover, the results of SLR may be replicated and extended as needed.
This paper proposes the application of SLR techniques to find a collection of works that proposes EAs to MTSP. In this way, we present a comprehensive survey of these works through a meta-analytic approach, examining authors, institutions, and other aspects. We organize works considering task characteristics, computational environments, and optimization criteria. This survey covers an interval of 30 years (from 1990 to 2019) of research papers; thus, we also map the historical evolution of studies in this scientific field. Our investigation considered four digital libraries that index computing studies; after the three stages of papers analysis, we created a collection composed of 56 studies (This paper is an extension of previous work by the same authors [3]. The original work focused only on journal articles and was limited to studies based on genetic algorithms. Moreover, that study was less rigorous in terms of inclusion and exclusion criteria and considered works until 2018.). The next section of this paper details our research protocol, presenting research questions, search key, and inclusion/exclusion criteria. Afterward, Section 3 presents the collection of 56 papers and provides a meta-analysis of them. Finally, we draw our concluding remarks (Section 4) and expose the references list.

Research Protocol
According to Kitchenham [5], the Systematic Literature Review (SLR) is a process of interpreting and evaluating relevant studies related to a research question, topic, or phenomenon. Consequently, we consider the SLR as an outstanding tool for aggregating experiences from a collection of different works [7].
The execution of an SLR follows some formal steps, named research protocol: (i) limit the research question; (ii) compose a search key; (iii) choose the databases; (iv) establish the inclusion/exclusion criteria; and (v) report the stages of the review.
The first step of an SLR consists of limiting the scope of the research. In the context of this paper, we are interested in mapping the last three decades of EAs applied to the MTSP. Moreover, we desire to characterize the application scenarios of such algorithms, that is, we intend to record which authors are dealing with which particularities of this problem (e.g., presence of communication among tasks, processors heterogeneity, and others).
Having this scope in mind, and concerning the SLR protocol, we formulated the main research question (referred to as MQ), as follows: For survey works corresponding to the questions above, we drew up a search key and applied it to the search mechanisms of the digital databases. For this, we eliminated corresponding keywords and combined different subjects using the logical operators OR and AND, respectively. Since the search key influences the behavior of the search engine, it is necessary to build it verifying each employed term. Table 1 presents the search key constructed in this study. ("evolutionary algorithm" OR "genetic algorithm") AND "task scheduling" AND (parallel OR multiprocessor) AND ("directed acyclic graph" OR DAG OR workflow) AND (representation OR encoding) AND "genetic operators" AND "objective function" In our research, we considered "evolutionary" and "genetic" algorithms as synonymous. Indeed, several authors do not provide a proper distinction between both terms; for this reason, we opted to maintain both expressions and filter results (according to De Jong [4], EAs are a class of algorithms and genetic algorithms are techniques from such class). We also regarded the words "graph", "workflow" and "DAG" in conjunction, as suggested by Robert [1]. Besides, we included both "parallel" and "multiprocessor" terms.
Finally, the search key includes the expressions "representation", "encoding", "genetic operators", and "objective function". All of these are specific for the context of evolutionary algorithms, and we considered them in order to select papers that provide details about algorithm design.
After preparing the search key, it is necessary to select some digital databases. Such databases must contain search engines that reproduce the semantics of the key to find related works. Table 2 shows the bases used in this study. It is necessary to highlight that intended to consider the ACM Digital Library (https://dl.acm.org/); however, its search engine did not reflect the main aspects contained in the search key, resulting in unsatisfactory outcomes when compared to the other databases. Although the search key encompasses several terms related to research questions, the amount of selected works on digital databases can be pervasive. Thus, the protocol used in the SLR must define some inclusion and exclusion criteria that reflect the format and desired aspects of the investigation. The criteria adopted in this protocol are:

1.
Papers must deal with MTSP in its static version and without the use of release times and deadlines (in other words, all information related to the problem is known before scheduling and the execution time is not limited); 2.
The research papers must deal with MTSP using EAs solutions, that is, EAs are the focus of the study; 3.
Selected studies cannot combine EAs with other meta-heuristics; 4.
All the papers must be in the language; 5.
Papers must be available for consultation on Internet platforms.
Based on all the rules defined here, we performed the following steps, named search phase, as required by the SLR protocol (we documented each SLR execution phase using annotations and spreadsheets-thus, we may extend this survey or adapt it to answer other research questions.): 1. Search in search engines: We perform the search step and collected all resulting works for future analysis; 2. Results filtering: we discarded works that did not meet the inclusion/exclusion criteria; 3. Manual search: we carried out a new survey of works by looking at the bibliographic references of the first pre-surveyed collection. We also submitted these new studies to the inclusion/exclusion criteria.

Analysis of the Collection
This section presents the collection of works recovered with the SLR. Each part answers one or more secondary questions presented in our SLR protocol. Thus, in Section 3.1, we answer the SQ.01; Section 3.2 deals with SQ.02 and SQ.03; Section 3.3 answers SQ.04, and Section 3.4 SQ.05. Sections 3.5 and 3.6 are related to SQ.06 and SQ.07, respectively. Finally, we present a summary in Section 3.7, answering our main research question (MQ).

Collection and Publication Timeline
This SLR is part of a more comprehensive research project [3,8]. For this reason, we collected the data into separated moments. The first collection took place in April 2018 and returned 430 works distributed as follows: • However, we observe a total of 50 works duplicated, that is, they appeared in two or more databases; consequently, after deleting duplicates, our research returned 380 papers. From this collection, we selected 120 papers based on information from their abstracts. Afterward, we applied the inclusion/exclusion criteria (second phase of SLR protocol) and reduced the number of selected papers to 36. Finally, in the third phase, we analyzed the section "References" of such papers, selecting more 27 new works; therefore, our first collection had 63 papers.
In order to extend our research and include new articles, covering the years 2018 and 2019, we repeated all those steps, performing a new search. We found 84 new results, but only five of them attempted to the inclusion/exclusion criteria; thus, we generated a final collection of 68 works (this new survey was conducted on February 12, 2020.). Before proceeding with the analysis, we opted to follow an additional exclusion criterion-remove works without the Digital Object Identifier (https://www.doi.org/) (DOI). We took this decision to allow the recovery of detailed information about selected works; after this last step, we eliminated 12 papers, obtained a collection with 56 works. Table 3 presents all the works that compose our collection in chronological order. When two or more papers were published in the same year, we organize them in alphabetical order. In this table, we also present how the work was published, that is, if it is a journal article ("Jour") or a conference paper ("Conf"); in this last case, the publication year refers to the publication date of the conference proceedings. Table 3. Summary of the collection in chronological order. We employ the ID along this paper to referee the works.

ID Reference
Year Type ID Reference Year Type  Another essential characteristic to be highlighted is that our research only returned works based on genetic algorithms, which means that our search key, together with our exclusion criteria, eliminates works that design other evolutionary techniques. This situation was expected since we consider keywords "genetic" and "evolutionary"; moreover, EAs tends to appear with other meta-heuristics, violating our inclusion/exclusion criteria.

Environments and Communication Cost
As already mentioned, MTSP can present variations in its multiprocessor system. In this context, papers of our collection deal with two primary environments: (i) homogeneous processors, that is, the execution time of a task does not vary depending on where it is executed; and (ii) heterogeneous processors. There are some papers that present solutions for both scenarios as well as papers that model communication costs. Table 4 lists all the works from the collection organized according to their environments. As observed, most of the works (29 in total) pay attention to the homogeneous environment, that is, when all the processors have the same processing capacity. Besides, 24 papers are related to the heterogeneous environment, and four works [27,28,43,49] deal with both scenarios. Azghadi et al. [53] did not provide sufficient information about where their algorithm is applied; we believe, based on information presented in the publication, that they also considered both.
Based on Table 4, the chart in Figure 3 distributes publications on a timeline (in this chart, we did not consider Azghadi et al. [53].), responding to the SQ.02. In addition to the significant growth between 2010 and 2011, the heterogeneous scenario is also common in the latest studies. This situation may be related to the popularity of MTSP in the Infrastructure as a Service (IaaS) paradigm, that is, virtual machines adjusted by frequency controls. We also analyzed the publications considering the use of the communication cost, that is, when DAG edges have weights. In this scenario, a task sends data (in bytes) to another, which imposes an additional delay in the execution (without communication, the successor task only waits for the execution of its predecessor). Responding to SQ.03, it is possible to observe that most works employ this scenario (about 80%), representing the most substantial volume of publications. Table 5 lists these works.

Optimization Criteria
There are several ways to measure the performance of an MTSP solution. Such measurement is responsible for driving the search process in an evolutionary algorithm. In practice, the quality of a solution is employed in the design of fitness functions that must be optimized. Table 6 lists all the metrics found in the collection in order to respond to the SQ.04. The most widely used metric is the makespan (sometimes referred to as scheduling length), which is present in the vast majority of the works surveyed (55 in total). This metric corresponds to the time instant when all the tasks have finished processing (in Figure 1, for example, the makespan value is equated to 12-time units). Besides makespan, the literature in this area also considered other metrics related to the execution time, that is, flowtime [10,34,58,61] and communication cost [17,64].
Another popular metric is the load balancing [12,17,44,59], which consists of the equitable distribution of tasks over processors. This metric is directed related to better resource utilization: if all the processors are overloaded, it is necessary more investments in the computational infrastructure.
It is also possible to observe the increase, in recent years, in the number of works that consider metrics focused on sustainability [36,48,50,56,60,62]. These works deal with issues related to the temperature and the energy consumption of computers (processors). Sustainability metrics are related to the popularization of distributed systems, mainly because of the adoption of computational clouds. Another recently adopted metric is the financial cost [64], which is related to scenarios that employ IaaS platforms (the performance of virtual machines depends on the frequency values contracted by the client).
All these performance metrics are related to optimization criteria. Most of them (makespan, financial cost, energy consumption, flowtime, temperature, number of processors, and communication cost) must be minimized. On the other hand, system reliability, load balancing, and parallelism are used to design maximization objective functions.
It is essential to highlight that several works consider two or more of these optimization criteria in conjunction. Indeed, 19 papers related to makespan also consider at least one of the other metrics. In most of these cases, works consider a weighted sum to combine these objectives; however, there is also multiobjective evolutionary approaches, based on the concept of Pareto's front [14,36,50,[60][61][62]64]. Finally, only Aguilar and Gelenbe [17] do not directly consider the minimization of makespan, having their focus on load balancing and communication cost.

Number of Citations
In this study, we also computed the number of citations in the work collection, answering the secondary question SQ.05. The chart in Figure 4 classifies the ten most cited publications considering the number of citations returned by the Google Scholar platform. We observe that the most referred publication [13] appeared in more than 900 related works. In fact, the work by Hou et al. [13] is considered one of the most important in this research field, presenting ideas that were extended by several works. Another prominent work is due to Wang et al. [25], with about 500 citations.  Additionally, we related which publication vehicles are preferred. The Journal of Parallel and Distributed Computing published a total of six works [12,16,19,25,44,60], followed by the IEEE Transactions on Parallel and Distributed Systems, with five works [13,31,35,43,50]. After them, Information Sciences [17,40] and Microprocessors and Microsystems [29,45] published two papers each; the other two works appeared in the same conference, the IEEE International Conference on High-Performance Computing and Communication, in 2012 [24,32].

Researchers Affiliations
The collection of works also allowed the extraction of information regarding the authors and their respective nationalities. In this context, "nationality" reflects the researcher's country of affiliation; therefore, it does not necessarily represent his country of origin. We identified 145 distinct researchers in 21 countries, responding to question SQ.06. Figure 5 presents an infographic distributing researchers from their affiliation countries. The countries with the most considerable number of researchers are India and the United States, both with 26 researchers each, followed by China, having 18. Brazil, Singapore, and Turkey have one researcher each.

Word Frequency Rank
Finally, in order to identify the main terms adopted in publications, this study also extracted the primary expressions used in abstracts and keywords. In Figure 6a, we quantify the most common keywords found. We also isolate the common expressions found in the abstracts, as presented in Figure 6b.   In both charts, we linked singular expressions to their respective plural versions. Moreover, this survey responds to the question SQ.07. The identification of the primary expressions can help in the use of terms more related to researches in this field, consequently helping indexing mechanisms and the search process.

Summary
In this survey, we perform a meta-analysis and answered seven secondary questions (enumerated in Section 2). As a result, it is possible to respond to the main question (MQ) of this study, that is, what are the central works in this research field.
Here we listed the ten most cited works presented in our collection. In order to summarize our answers, Table 7 related some information about these works: reference ("Ref."), title, publication vehicle, and publication year. We also present the characteristics of the environments, that is, if the work deals with a homogeneous or a heterogeneous scenario, with or without communication costs ("CC"). Finally, we provide the Scientific Journal Rankings (SJR), retrieved on March 13, 2020.

Conclusions and Future Research
This paper has reported the results of a systematic literature review (SLR) on evolutionary algorithms for the multiprocessor task scheduling problem. Based on four databases, we generated a collection of 56 papers published in journals or presented in conferences between the years of 1990 and 2019, that is, we covered three decades of studies.
From this collection, we presented an overview of general characteristics from the selected works and, consequently, from this research field. We observed that homogeneous computational environments are still the most common, with 29 published works (about 52% of the collection). Moreover, more than 80% of selected works deal with communication costs. All the papers except one look for the minimization of makespan, sometimes in conjunction with other optimization criteria.
In this survey, we focus on genetic algorithms; this means that we did not cover other evolutionary approaches (such as differential evolution, estimation of distribution algorithms, memetic algorithms, swarm intelligence, and others). We took this decision to limit our research to the most common evolutionary technique.
Finally, from this collection, we can now extract a large volume of information about the structure of GAs. For example, it is possible to analyze details about solution encoding (i.e., chromosome), genetic operators (mutation and crossover), as well as mechanisms to deal with variability and convergence. Further studies may include an experimental evaluation of these design decisions.
Regarding the proposed framework for this SLR, it can be extended in future works to grant other scheduling problems (for example, considering independent tasks) or solving methods.