1. Introduction
The multiprocessor task scheduling problem (MTSP) consists of mapping a task graph onto a target platform [
1]. In this model, the graph represents the parallel application, that is, vertices denote computational tasks and edges model precedence constraints between tasks. The objective is to find a mapping of tasks over processors in order to optimize some performance criteria [
2].
Formally, tasks and their respective relations are modeled using a directed acyclic graph (DAG),
, where the (finite) set
represents the tasks (vertices), and set
E (edges) represents precedence constraints between tasks, that is,
if and only if
depends on the execution of
. Moreover, the weight function
gives the number of processing instructions required by each task [
1]. A task
may send (communicate) data to a successor task
; this situation is represented by weights on the DAG edges, indicate the communication cost. When the same processor executes both tasks, this cost is negligible.
Figure 1a [
3] illustrates a DAG with
tasks and its execution costs (for instance,
means that task
has a cost of 3 million of processing instructions). In this same graph, edges represent a precedence relationship, and its weights are the communication costs.
The computational environment consists of a set
of widely interconnected processors. In a homogeneous environment, all processors have the same processing capacity (given in millions of instructions per second—MIPS). On the other hand, in heterogeneous environments, the computational cost of each task may vary depending on the processor where it is attributed. Finally, a solution to the MTSP can be evaluated in several ways: execution time of application, resource utilization, and others.
Figure 1b [
3] shows a scheduling solution, represented as a Gantt Chart for two identical processors. In this solution, tasks
are allocated over processor
and
over processor
. In this example, the execution time of the application consists of 12-time units.
Similar to many optimization problems in the real world, the search for the optimal MTSP solution has a high computational complexity [
1,
2] and cannot be solved promptly by brute force techniques. One of the most investigated approaches to this type of problem is the use of techniques that, in many cases, discover a set of potential solutions, from which one can find the global optimum or an approximate solution to that problem. In this context, evolutionary algorithms (EAs) have been investigated as a search method due to their flexibility to solve complex optimization problems.
EAs [
4] can be adapted to various types of problems. Inspired by the evolutionary paradigm, these techniques work with the evolution of several solutions (population) simultaneously, ensuring a significant advantage when compared to other techniques. There are several studies involving EAs applied to the scheduling problem, and this number increases every year. For this reason, it is necessary to systematically organize these works in order to summarize their main characteristics, understand contributions, authors, and institutions that maintain relevant publications. With such information, it is possible to identify domains that may receive attention to additional research in the future.
One way to investigate a significant amount of studies, and synthesize their contributions, is through a Systematic Literature Review (SLR) [
5,
6]. SLR consists of a rigorous procedure to select and analyze works that fit a research question without following a bias on the part of the researchers. Moreover, the results of SLR may be replicated and extended as needed.
This paper proposes the application of SLR techniques to find a collection of works that proposes EAs to MTSP. In this way, we present a comprehensive survey of these works through a meta-analytic approach, examining authors, institutions, and other aspects. We organize works considering task characteristics, computational environments, and optimization criteria. This survey covers an interval of 30 years (from 1990 to 2019) of research papers; thus, we also map the historical evolution of studies in this scientific field. Our investigation considered four digital libraries that index computing studies; after the three stages of papers analysis, we created a collection composed of 56 studies (This paper is an extension of previous work by the same authors [
3]. The original work focused only on journal articles and was limited to studies based on genetic algorithms. Moreover, that study was less rigorous in terms of inclusion and exclusion criteria and considered works until 2018.). The next section of this paper details our research protocol, presenting research questions, search key, and inclusion/exclusion criteria. Afterward,
Section 3 presents the collection of 56 papers and provides a meta-analysis of them. Finally, we draw our concluding remarks (
Section 4) and expose the references list.
2. Research Protocol
According to Kitchenham [
5], the Systematic Literature Review (SLR) is a process of interpreting and evaluating relevant studies related to a research question, topic, or phenomenon. Consequently, we consider the SLR as an outstanding tool for aggregating experiences from a collection of different works [
7]. The execution of an SLR follows some formal steps, named
research protocol: (i) limit the research question; (ii) compose a search key; (iii) choose the databases; (iv) establish the inclusion/exclusion criteria; and (v) report the stages of the review.
The first step of an SLR consists of limiting the scope of the research. In the context of this paper, we are interested in mapping the last three decades of EAs applied to the MTSP. Moreover, we desire to characterize the application scenarios of such algorithms, that is, we intend to record which authors are dealing with which particularities of this problem (e.g., presence of communication among tasks, processors heterogeneity, and others).
Having this scope in mind, and concerning the SLR protocol, we formulated the main research question (referred to as MQ), as follows:
MQ: “Between the years 1990 to 2019, what are the central studies on evolutionary algorithms applied to the multiprocessor task scheduling problem?”
In addition to the direction of MQ, it is necessary to prepare other questions (called secondary questions—SQ) to expand the collection with relevant information. These questions are necessary to guide the research process, providing directions to select relevant works. In this paper, we considered nine secondary questions, enumerated here:
- SQ.01:
What is the period with the most significant amount of research papers?
- SQ.02:
What are the computational environments considered in these studies?
- SQ.03:
How do these studies deal with the cost of communication?
- SQ.04:
What are the main optimization criteria considered by the studies?
- SQ.05:
Which of these papers has the highest number of citations?
- SQ.06:
Which regions of the planet have the highest number of authors?
- SQ.07:
What are the most used expressions in abstracts and keywords?
For survey works corresponding to the questions above, we drew up a search key and applied it to the search mechanisms of the digital databases. For this, we eliminated corresponding keywords and combined different subjects using the logical operators
OR and
AND, respectively. Since the search key influences the behavior of the search engine, it is necessary to build it verifying each employed term.
Table 1 presents the search key constructed in this study.
In our research, we considered “evolutionary” and “genetic” algorithms as synonymous. Indeed, several authors do not provide a proper distinction between both terms; for this reason, we opted to maintain both expressions and filter results (according to De Jong [
4], EAs are a class of algorithms and genetic algorithms are techniques from such class). We also regarded the words “graph”, “workflow” and “DAG” in conjunction, as suggested by Robert [
1]. Besides, we included both “parallel” and “multiprocessor” terms.
Finally, the search key includes the expressions “representation”, “encoding”, “genetic operators”, and “objective function”. All of these are specific for the context of evolutionary algorithms, and we considered them in order to select papers that provide details about algorithm design.
After preparing the search key, it is necessary to select some digital databases. Such databases must contain search engines that reproduce the semantics of the key to find related works.
Table 2 shows the bases used in this study. It is necessary to highlight that intended to consider the ACM Digital Library (
https://dl.acm.org/); however, its search engine did not reflect the main aspects contained in the search key, resulting in unsatisfactory outcomes when compared to the other databases.
Although the search key encompasses several terms related to research questions, the amount of selected works on digital databases can be pervasive. Thus, the protocol used in the SLR must define some inclusion and exclusion criteria that reflect the format and desired aspects of the investigation. The criteria adopted in this protocol are:
Papers must deal with MTSP in its static version and without the use of release times and deadlines (in other words, all information related to the problem is known before scheduling and the execution time is not limited);
The research papers must deal with MTSP using EAs solutions, that is, EAs are the focus of the study;
Selected studies cannot combine EAs with other meta-heuristics;
All the papers must be in the language;
Papers must be available for consultation on Internet platforms.
Based on all the rules defined here, we performed the following steps, named search phase, as required by the SLR protocol (we documented each SLR execution phase using annotations and spreadsheets—thus, we may extend this survey or adapt it to answer other research questions.):
Search in search engines: We perform the search step and collected all resulting works for future analysis;
Results filtering: we discarded works that did not meet the inclusion/exclusion criteria;
Manual search: we carried out a new survey of works by looking at the bibliographic references of the first pre-surveyed collection. We also submitted these new studies to the inclusion/exclusion criteria.
3. Analysis of the Collection
This section presents the collection of works recovered with the SLR. Each part answers one or more secondary questions presented in our SLR protocol. Thus, in
Section 3.1, we answer the
SQ.01;
Section 3.2 deals with
SQ.02 and
SQ.03;
Section 3.3 answers
SQ.04, and
Section 3.4 SQ.05.
Section 3.5 and
Section 3.6 are related to
SQ.06 and
SQ.07, respectively. Finally, we present a summary in
Section 3.7, answering our main research question (
MQ).
3.1. Collection and Publication Timeline
This SLR is part of a more comprehensive research project [
3,
8]. For this reason, we collected the data into separated moments. The first collection took place in April 2018 and returned 430 works distributed as follows:
Google Scholar: 237 works;
Elsevier-Science Direct: 157 works;
Portal de Periódicos Capes: 34 works;
IEEE Xplore Digital Library: 2 works.
However, we observe a total of 50 works duplicated, that is, they appeared in two or more databases; consequently, after deleting duplicates, our research returned 380 papers. From this collection, we selected 120 papers based on information from their abstracts. Afterward, we applied the inclusion/exclusion criteria (second phase of SLR protocol) and reduced the number of selected papers to 36. Finally, in the third phase, we analyzed the section “References” of such papers, selecting more 27 new works; therefore, our first collection had 63 papers.
In order to extend our research and include new articles, covering the years 2018 and 2019, we repeated all those steps, performing a new search. We found 84 new results, but only five of them attempted to the inclusion/exclusion criteria; thus, we generated a final collection of 68 works (this new survey was conducted on February 12, 2020.). Before proceeding with the analysis, we opted to follow an additional exclusion criterion—remove works without the Digital Object Identifier (
https://www.doi.org/) (DOI). We took this decision to allow the recovery of detailed information about selected works; after this last step, we eliminated 12 papers, obtained a collection with 56 works.
Table 3 presents all the works that compose our collection in chronological order. When two or more papers were published in the same year, we organize them in alphabetical order. In this table, we also present how the work was published, that is, if it is a journal article (“Jour”) or a conference paper (“Conf”); in this last case, the publication year refers to the publication date of the conference proceedings.
Note that only one paper was published in 1990 [
9], and, after this, two others appeared in 1994 [
11,
13]. Closing the collection, three papers were published in 2019 [
60,
62,
64]. In all, 31 papers (≈55%) were published in journals. Based on
Table 3,
Figure 2 presents the timeline of publications. Answering the first secondary research question (
SQ.01), the period between 2011 and 2012 is the interval with the most significant number of publications.
Another essential characteristic to be highlighted is that our research only returned works based on genetic algorithms, which means that our search key, together with our exclusion criteria, eliminates works that design other evolutionary techniques. This situation was expected since we consider keywords “genetic” and “evolutionary”; moreover, EAs tends to appear with other meta-heuristics, violating our inclusion/exclusion criteria.
3.2. Environments and Communication Cost
As already mentioned, MTSP can present variations in its multiprocessor system. In this context, papers of our collection deal with two primary environments: (i) homogeneous processors, that is, the execution time of a task does not vary depending on where it is executed; and (ii) heterogeneous processors. There are some papers that present solutions for both scenarios as well as papers that model communication costs.
Table 4 lists all the works from the collection organized according to their environments.
As observed, most of the works (29 in total) pay attention to the homogeneous environment, that is, when all the processors have the same processing capacity. Besides, 24 papers are related to the heterogeneous environment, and four works [
27,
28,
43,
49] deal with both scenarios. Azghadi et al. [
53] did not provide sufficient information about where their algorithm is applied; we believe, based on information presented in the publication, that they also considered both.
Based on
Table 4, the chart in
Figure 3 distributes publications on a timeline (in this chart, we did not consider Azghadi et al. [
53].), responding to the
SQ.02. In addition to the significant growth between 2010 and 2011, the heterogeneous scenario is also common in the latest studies. This situation may be related to the popularity of MTSP in the Infrastructure as a Service (IaaS) paradigm, that is, virtual machines adjusted by frequency controls.
We also analyzed the publications considering the use of the communication cost, that is, when DAG edges have weights. In this scenario, a task sends data (in bytes) to another, which imposes an additional delay in the execution (without communication, the successor task only waits for the execution of its predecessor). Responding to
SQ.03, it is possible to observe that most works employ this scenario (about
), representing the most substantial volume of publications.
Table 5 lists these works.
3.3. Optimization Criteria
There are several ways to measure the performance of an MTSP solution. Such measurement is responsible for driving the search process in an evolutionary algorithm. In practice, the quality of a solution is employed in the design of fitness functions that must be optimized.
Table 6 lists all the metrics found in the collection in order to respond to the
SQ.04.
The most widely used metric is the
makespan (sometimes referred to as
scheduling length), which is present in the vast majority of the works surveyed (55 in total). This metric corresponds to the time instant when all the tasks have finished processing (in
Figure 1, for example, the makespan value is equated to 12-time units). Besides makespan, the literature in this area also considered other metrics related to the execution time, that is,
flowtime [
10,
34,
58,
61] and
communication cost [
17,
64].
Another popular metric is the
load balancing [
12,
17,
44,
59], which consists of the equitable distribution of tasks over processors. This metric is directed related to better resource utilization: if all the processors are overloaded, it is necessary more investments in the computational infrastructure.
It is also possible to observe the increase, in recent years, in the number of works that consider metrics focused on sustainability [
36,
48,
50,
56,
60,
62]. These works deal with issues related to the temperature and the energy consumption of computers (processors). Sustainability metrics are related to the popularization of distributed systems, mainly because of the adoption of computational clouds. Another recently adopted metric is the financial cost [
64], which is related to scenarios that employ IaaS platforms (the performance of virtual machines depends on the frequency values contracted by the client).
All these performance metrics are related to optimization criteria. Most of them (makespan, financial cost, energy consumption, flowtime, temperature, number of processors, and communication cost) must be minimized. On the other hand, system reliability, load balancing, and parallelism are used to design maximization objective functions.
It is essential to highlight that several works consider two or more of these optimization criteria in conjunction. Indeed, 19 papers related to makespan also consider at least one of the other metrics. In most of these cases, works consider a weighted sum to combine these objectives; however, there is also multiobjective evolutionary approaches, based on the concept of Pareto’s front [
14,
36,
50,
60,
61,
62,
64]. Finally, only Aguilar and Gelenbe [
17] do not directly consider the minimization of makespan, having their focus on load balancing and communication cost.
3.4. Number of Citations
In this study, we also computed the number of citations in the work collection, answering the secondary question
SQ.05. The chart in
Figure 4 classifies the ten most cited publications considering the number of citations returned by the Google Scholar platform. We observe that the most referred publication [
13] appeared in more than 900 related works. In fact, the work by Hou et al. [
13] is considered one of the most important in this research field, presenting ideas that were extended by several works. Another prominent work is due to Wang et al. [
25], with about 500 citations.
Additionally, we related which publication vehicles are preferred. The
Journal of Parallel and Distributed Computing published a total of six works [
12,
16,
19,
25,
44,
60], followed by the
IEEE Transactions on Parallel and Distributed Systems, with five works [
13,
31,
35,
43,
50]. After them,
Information Sciences [
17,
40] and
Microprocessors and Microsystems [
29,
45] published two papers each; the other two works appeared in the same conference, the
IEEE International Conference on High-Performance Computing and Communication, in 2012 [
24,
32].
3.5. Researchers Affiliations
The collection of works also allowed the extraction of information regarding the authors and their respective nationalities. In this context, “nationality” reflects the researcher’s country of affiliation; therefore, it does not necessarily represent his country of origin. We identified 145 distinct researchers in 21 countries, responding to question SQ.06.
Figure 5 presents an infographic distributing researchers from their affiliation countries. The countries with the most considerable number of researchers are India and the United States, both with 26 researchers each, followed by China, having 18. Brazil, Singapore, and Turkey have one researcher each.
3.6. Word Frequency Rank
Finally, in order to identify the main terms adopted in publications, this study also extracted the primary expressions used in abstracts and keywords. In
Figure 6a, we quantify the most common keywords found. We also isolate the common expressions found in the abstracts, as presented in
Figure 6b.
In both charts, we linked singular expressions to their respective plural versions. Moreover, this survey responds to the question SQ.07. The identification of the primary expressions can help in the use of terms more related to researches in this field, consequently helping indexing mechanisms and the search process.
3.7. Summary
In this survey, we perform a meta-analysis and answered seven secondary questions (enumerated in
Section 2). As a result, it is possible to respond to the main question (
MQ) of this study, that is, what are the central works in this research field.
Here we listed the ten most cited works presented in our collection. In order to summarize our answers,
Table 7 related some information about these works: reference (“Ref.”), title, publication vehicle, and publication year. We also present the characteristics of the environments, that is, if the work deals with a homogeneous or a heterogeneous scenario, with or without communication costs (“CC”). Finally, we provide the Scientific Journal Rankings (SJR), retrieved on March 13, 2020.
From these works, seven deal with homogeneous environments [
12,
13,
17,
19,
25,
31,
35], two with heterogeneous [
22,
40], and only one with both environments [
43]. Additionally, Hou et al. [
13], Zomaya et al. [
35], and Wang et al. [
22] do not model communication costs. All these works, except Aguilar and Gelenbe [
17], consider makespan as primarily optimization metric. Besides, Omara and Arafa [
12] and Wang et al. [
22] also take into account load balancing and system reliability, respectively, as a secondary metric, and Aguilar and Gelenbe [
17] employ load balancing and communication cost (none of them, however, design a multiobjective algorithm). At long last, all these works were published in journals.
4. Conclusions and Future Research
This paper has reported the results of a systematic literature review (SLR) on evolutionary algorithms for the multiprocessor task scheduling problem. Based on four databases, we generated a collection of 56 papers published in journals or presented in conferences between the years of 1990 and 2019, that is, we covered three decades of studies.
From this collection, we presented an overview of general characteristics from the selected works and, consequently, from this research field. We observed that homogeneous computational environments are still the most common, with 29 published works (about of the collection). Moreover, more than of selected works deal with communication costs. All the papers except one look for the minimization of makespan, sometimes in conjunction with other optimization criteria.
In this survey, we focus on genetic algorithms; this means that we did not cover other evolutionary approaches (such as differential evolution, estimation of distribution algorithms, memetic algorithms, swarm intelligence, and others). We took this decision to limit our research to the most common evolutionary technique.
Finally, from this collection, we can now extract a large volume of information about the structure of GAs. For example, it is possible to analyze details about solution encoding (i.e., chromosome), genetic operators (mutation and crossover), as well as mechanisms to deal with variability and convergence. Further studies may include an experimental evaluation of these design decisions.
Regarding the proposed framework for this SLR, it can be extended in future works to grant other scheduling problems (for example, considering independent tasks) or solving methods.