A Survey on QoS Requirements Based on Particle Swarm Optimization Scheduling Techniques for Workflow Scheduling in Cloud Computing

Abstract: Cloud computing is an innovative technology that deploys networks of servers, located in widely distributed remote areas, to perform operations on large amounts of data. In cloud computing, a workflow model is used to represent different scientific and web applications. One of the main issues in this context is scheduling large workflows of tasks with scientific standards on the heterogeneous cloud environment. Other issues are particular to public cloud computing. These include the need to satisfy the user's quality of service (QoS) requirements, such as scalability and reliability, and to maximize the end users' resource utilization rate. This paper surveys scheduling algorithms based on particle swarm optimization (PSO). The aim is to help users decide on the most suitable QoS considerations for large workflows in infrastructure as a service (IaaS) cloud applications and for mapping tasks to resources. In addition, the scheduling schemes are categorized according to the variant of the PSO algorithm implemented. Their objectives, characteristics, limitations and testing tools are also highlighted. Finally, directions for future research are identified. This paper presents the distribution of published research on workflow scheduling in the last few years. Specific RQs are considered for resolving the gaps in current strategies (RQ3). From Sections 1–6, we evaluated potential expectations (RQ5) after conducting QA, SDS, and DCP in the corresponding publications between the years 2000 and 2019.


Introduction
Cloud computing is a new technology that provides virtual, scalable and dynamic resources to users on a pay-as-you-use basis. This technology is network-dependent. Due to the network scale involved, some services such as e-commerce applications use up the entire network [1]. Cloud computing has practically developed in three stages: distributed computing, parallel computing and grid computing [2]. Cloud computing is also used for executing scientific workflows in areas such as geology, astronomy, biology, cosmic analysis, biotechnology and image processing. Cloud computing can be classified into three types: public clouds, which everyone can register for and use; private clouds, which are operated within an organization without limitations on network bandwidth, security visibility and regulatory specifications; and hybrid clouds, which merge private clouds with public cloud resources. An important aspect of cloud computing is workflow scheduling, which is the process of mapping dependent tasks to the available resources while considering quality of service (QoS) constraints [3].
Workflow scheduling incurs a large communication and computation cost [4]. Several meta-heuristic algorithms such as ant colony optimization (ACO), genetic algorithm (GA), simulated annealing (SA) and particle swarm optimization (PSO) have been proposed for scheduling tasks or workflows in the cloud environment [5]. As one of the meta-heuristic approaches used to solve optimization problems, PSO initially generates a random population of solutions and then searches for optimal solutions by updating generations. Unlike evolutionary algorithms such as GA, the traditional PSO algorithm has no evolutionary operators such as crossover and mutation [6]. Candidate solutions in PSO (called particles) move through the problem space by following the current best particles [7].
One of the fundamental challenges of cloud computing is how to manage QoS in terms of reliability and performance during the workflow scheduling process. This involves scheduling complex workflows to concurrently reduce execution time and cost using the heterogeneous resources of the cloud [8]. Most previous studies focus on only one objective, such as reducing the execution time or minimizing the total execution cost, to optimize the performance of workflow scheduling applications while satisfying users' QoS constraints. However, the complex nature of dynamic workflows requires that total execution cost be traded off against processing time to strike a balance between them.

Motivation
The above-mentioned challenge motivates this systematic review that focuses on the QoS requirements of PSO-based algorithms built for scheduling workflows in the cloud environment. The schemes are classified according to the type of PSO algorithm used. Their objectives and properties are also highlighted. In addition, we show the enhancements made by hybridizing PSO algorithms with other meta-heuristic algorithms. This analysis compares the schemes and discusses the peculiarities of each scheme to motivate further research in this field.

Related Works
Most of the previous surveys on PSO focused on task and workflow scheduling in cloud computing without considering future directions, open issues and some quality of service metrics. Some surveys [9,10] have also reviewed task scheduling algorithms based on PSO. However, their works fall outside the scope of this paper. They described the basic working principles of several scheduling algorithms but did not discuss the pros and cons of the various approaches. In contrast to the previous surveys, we systematically review the literature on PSO-based workflow scheduling algorithms in cloud computing while taking account of QoS metrics such as fault tolerance, execution time and cost. Moreover, future directions are highlighted. Table 1 compares this paper with the previous reviews on workflow scheduling using PSO-based algorithms.

Contributions
The contributions of this survey are as follows:
• Classification of PSO-based workflow scheduling algorithms in cloud computing. The QoS constraints, type of workflow used, advantages and disadvantages of these algorithms in cloud computing are indicated.
• Identification of various quality of service metrics used in the literature. Table 2 defines most of these QoS metrics.
• Identification of bi-objective, tri-objective and multi-objective PSO-based scheduling approaches in the literature.
• Presentation of future directions on state-of-the-art PSO-based scheduling algorithms.
The rest of this paper is organized as follows: Section 2 illustrates the systematic review process adopted to select the relevant research articles for our classification, while Section 3 provides the background to this study. Section 4 discusses the PSO-based scheduling strategies in cloud computing. Section 5 classifies the PSO-based scheduling schemes. Section 6 summarizes the literature review and indicates the limitations of the review. Section 7 provides a technical comparison of cloud, fog and edge computing. Open challenges and future research directions are presented in Section 8. Finally, Section 9 concludes this paper.

Systematic Review Process
This section gives a clear description of how this review is defined, analyzed and interpreted. Research questions are developed to clearly outline the goals of this study. A search protocol is then designed so that the most relevant research papers can be reviewed. This we achieved by using search strings as well as selected digital libraries. The inclusion and exclusion criteria are defined to determine the parameters and research articles that will be included or excluded during the review process. Thereafter, the data obtained are synthesized to ensure all study questions are answered meaningfully.

Research Question (RQ)
The research questions below were formulated in the planning stage of this review and their answers are provided in subsequent sections:
RQ1. Which heuristic, meta-heuristic or hybrid PSO technologies are available to support workflow scheduling?
RQ2. Which simulation tool is mostly used to conduct cloud computing experiments?
RQ3. What are the flaws of the current PSO-based workflow scheduling strategies?
RQ4. Which PSO-based workflow scheduling algorithm performs best for different QoS constraints?
RQ5. What are the prospects for PSO-based workflow scheduling schemes?
To answer these questions, we follow the Systematic Literature Review (SLR) protocols [14].

Search Strategy
For search string construction, the following steps were followed [15]:
(1) Selecting the keywords from the RQs.
(2) Discovering the synonyms and alternate spellings to extend the keywords.
(3) Investigating the resulting keywords in the relevant literature.
(4) Combining the synonyms and alternate spellings using 'Boolean OR'.
(5) Combining the main terms and conditions using 'Boolean AND'.
Thereafter, we found the general terms and conditions relating to QoS customization. The corresponding search string used in the digital library engines is: "(("multi objective" OR "multi-objective") AND ("Particle Swarm Optimization" OR "PSO") AND ("Workflow Scheduling") AND ("Cloud Computing" OR "Cloud") AND ("QoS"))".
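Steps (4) and (5) can be illustrated with a short sketch. The helper name `build_search_string` and the list-of-lists input are assumptions made for this illustration only; the sketch simply joins synonyms with OR and the resulting groups with AND, which reproduces the search string quoted above.

```python
def build_search_string(term_groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    clauses = []
    for synonyms in term_groups:
        quoted = " OR ".join(f'"{s}"' for s in synonyms)
        clauses.append(f"({quoted})")
    return "(" + " AND ".join(clauses) + ")"


# Keyword groups taken from the search string reported in this section.
groups = [
    ["multi objective", "multi-objective"],
    ["Particle Swarm Optimization", "PSO"],
    ["Workflow Scheduling"],
    ["Cloud Computing", "Cloud"],
    ["QoS"],
]
query = build_search_string(groups)
print(query)
```

Individual digital libraries may require small syntax changes (for example, field qualifiers), which this sketch does not attempt to model.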

Quest Approach (QA)
The research methodology of this analysis has been designed to ensure specific and unprejudiced solutions in the literature are considered for providing answers to the RQs. The defined search parameters are cloud computing, optimization, workflows, scheduling, heuristic techniques, meta-heuristic, and hybrid optimization.

Sample Discrimination Strategy (SDS)
Initially, 474 research papers were collected and screened for the review using the criteria specified. Some research papers were withdrawn because their titles did not fit the scope of our survey and/or they had incomplete descriptions and conclusions. A further 150 research papers were read but excluded because their contents were not relevant to our subject. After the analysis, 79 relevant research papers remained for our survey of PSO-based workflow scheduling. These papers can be found in the list of references. The research papers were published between 2000 and 2019, with some important historical references [16,17].

Data Clarification and Planning (DCP)
To organize and compare the required information, extensive structures (Tables 3-8) were generated. Then this information was further arranged to respond to the targeted RQs.

Target Audience
The survey is targeted at researchers interested in PSO algorithm for scheduling workflows based on specific QoS constraints. Scientists with a broad range of backgrounds including distributed computing, cloud, Big Data, Grid and parallel computing can also benefit from learning how workflows can be scheduled using PSO-based techniques that have been influenced by research over the past few decades. PSO algorithm has found broad use in several fields of computer science and applied mathematical applications including neural network weight calculation, time series analysis, market optimization and many more.

Workflow
A workflow is a sequence of activities carried out to achieve a defined objective in any environment. It is a group of simple processes that are used for solving complex problems [18]. These processes follow a certain order to improve the execution procedure and ensure efficiency. Workflows define the way various tasks are configured, performed and tracked. A workflow can be modelled as a Directed Acyclic Graph (DAG) consisting of nodes and edges (Figure 1). It can be represented as $W = (T, E)$, where $T = \{t_1, t_2, \ldots, t_n\}$ is a set of tasks and $E$ is a set of edges $(t_a, t_b)$ [19]. A schedule is denoted as:

$$S = (Res, map, CE, TE)$$

where $Res = \{r_1, r_2, \ldots, r_n\}$ represents the resources, $map$ is the mapping of task-to-resource, $CE$ is the total execution cost and $TE$ is the total execution time [19].
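To make the DAG model concrete, the following minimal sketch stores a workflow as a task list and an edge list and derives an execution order in which every parent precedes its children. The four-task workflow and the function name `topological_order` are hypothetical; the ordering uses Kahn's algorithm, which is one common choice and is not taken from [19].

```python
from collections import deque


def topological_order(tasks, edges):
    """Return tasks so that every parent precedes its children (Kahn's algorithm)."""
    children = {t: [] for t in tasks}
    indegree = {t: 0 for t in tasks}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    ready = deque(t for t in tasks if indegree[t] == 0)  # tasks with no pending parent
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for c in children[t]:
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    if len(order) != len(tasks):
        raise ValueError("edges contain a cycle, so this is not a DAG")
    return order


# Hypothetical workflow W = (T, E): t1 fans out to t2 and t3, which join at t4.
T = ["t1", "t2", "t3", "t4"]
E = [("t1", "t2"), ("t1", "t3"), ("t2", "t4"), ("t3", "t4")]
order = topological_order(T, E)
print(order)
```

A scheduler would typically walk such an order when assigning tasks to resources, so that no task is started before the tasks it depends on.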

Scientific Workflow
Several workflows with scientific standards are used in scheduling processes to allocate tasks to appropriate resources. They are also used to measure the efficiency of scheduling methods in different scientific areas. Each scientific workflow contains tasks arranged in levels in the form of a parent-child relationship. Based on this relationship, a parent must be processed first before the child. Several workflow scheduling methods work with different scientific workflow datasets considering that some datasets are complicated and huge. These datasets are processed to meet the users' requirements without violating the considered constraints. Scientific-oriented workflows cover a wide range of areas such as geology, astronomy, biology, cosmic analysis, biotechnology as well as image processing. Some of the most popular workflows are highlighted below and illustrated in Figure 2:

Workflow Scheduler
To optimize defined objectives, a scheduler analyzes and distributes its tasks to the available resources. It provides a summary of the workflow, identifies multiple queues and distributes tasks to effectively run the device based on the user's requirements. Workflows can be forwarded to resources using simple approaches such as queues or more complicated techniques. When users need to use cloud resources efficiently, an efficient scheduler is required to help them achieve their goals. The main function of workflow schedulers, which can be built on algorithms such as GA, ACO and PSO described below, is to provide satisfactory QoS to end-users.

Genetic Algorithm (GA)
GA is a bio-inspired algorithm developed in the 1960s by John Holland at the University of Michigan (USA). It is an optimization strategy that is well-known for finding an approximate solution to a search problem. It is used in many scientific applications such as cancer scanning, gene expression profiling analysis, robotics, telecommunications, engineering design, automotive and marketing [20].

Ant Colony Optimization (ACO)
ACO was proposed by Marco Dorigo in 1992. It is a probability-based algorithm for finding the best path. It simulates the procedure by which ants find the shortest path during their search for food. Practical applications of ACO include the train scheduling system, timetabling, shape optimization, telecommunication network design and problems in computational biology [21].

Particle Swarm Optimization (PSO)
PSO was developed by Eberhart and Kennedy in 1995. It is a population-based optimization technique that simulates the social behavior of bird flocking or fish schooling [17]. A typical PSO algorithm is presented in Figure 3. PSO can be applied in many different areas such as artificial neural network training, function optimization, fuzzy system control and other areas where GA can be used [22]. In the PSO update rule, $w$ is the inertia weight, $x_i$ is the current position of particle $i$, $p_{best}$ is the best position found by particle $i$ and $g_{best}$ is the position of the best particle in the population.
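The update rule that these symbols belong to is not reproduced in the text, so the following is only a minimal sketch of the commonly used inertia-weight, global-best form of PSO. The acceleration coefficients `c1` and `c2`, the random factors `r1` and `r2`, the parameter values, the position clamping and the sphere test function are assumptions for illustration and are not taken from Figure 3.

```python
import random


def pso_minimize(f, dim, bounds, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO with inertia weight w; returns (best position, best value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]                 # best position seen by each particle
    pbest_val = [f(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia term + pull toward pbest + pull toward gbest
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] = min(hi, max(lo, x[i][d] + v[i][d]))  # keep inside bounds
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val


def sphere(p):
    """Simple convex test function with its minimum of 0 at the origin."""
    return sum(c * c for c in p)


best_pos, best_val = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_pos, best_val)
```

In a scheduling setting the objective `f` would be replaced by a fitness function over task-to-resource mappings, for example one based on makespan or cost.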
The three algorithms mentioned above have their ways of solving complex optimization problems. Each algorithm has its characteristic performance in finding the best solution, depending on the problems. They can be compared based on the differences in their operation.

QoS Constraints
In cloud computing, customers define QoS constraints according to their requirements. Some of these QoS constraints in the literature are described in Table 2 below.

| No. | QoS constraint | Description |
|---|---|---|
| 1 | Makespan | The period between the starting time of the execution and the completion time of the actual workflow [23]. |
| 2 | Cost | The amount paid by users for executing workload on cloud providers' services [24]. |
| 3 | Throughput | The total number of users' requests finished within a time limit [25]. |
| 4 | Reliability | This is the ratio of the total number of performed tasks to the total number of tasks. The aim is to provide services to clients [26]. |
| 5 | Resource utilization | The appropriate use of resources in the course of workflow scheduling using the idle time gaps [27]. |
| 6 | | The difference between the completion time and the task submission time [28]. |
| 7 | Success rate | The total number of workflows carried out within user-defined constraints [29]. |
| 8 | Tardiness | This defines how long a workflow schedule has been postponed to the extent that the time of completion exceeds the estimated time limit [30]. |
| 9 | Resource availability | This estimates the number of resources available to map tasks in order to reduce the rate of failure [31]. |
| 10 | Load balancing | This defines how the scheduler optimizes the resources used to reduce the pressure on cloud resources [11]. |
| 11 | Response time | The time between task arrival and task completion [32]. |
| 12 | Budget | The expense of using cloud services for a certain period of time [33]. |
| 13 | Deadline | The user's time limit to perform the workflow [34]. |
| 14 | Waiting time | This determines the interval between the time the task is ready and when the task begins [35]. |
| 15 | Execution time | The time it takes for the resource to perform the task [23]. |
| 16 | Security | This describes a stable scheduling to reduce the effect of security attacks by attackers via abusing the cloud services [11]. |
| 17 | Energy consumption | This determines the utilization of energy during the scheduling process [36]. |
| 18 | Fault tolerance | This identifies the hardware and software problems that can occur from the start of execution until the last job in the workflow is completed [37]. |
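As a rough illustration of how some of these metrics could be computed from an executed schedule, the sketch below derives makespan, mean waiting time, tardiness and reliability from per-task timestamps. The `TaskRecord` fields, the function name `qos_summary` and the sample values are hypothetical, and the deadline is assumed to be an absolute time measured on the same clock as the task timestamps.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    ready: float            # time the task became ready to run
    start: float            # time execution began
    finish: float           # time execution completed
    succeeded: bool = True  # whether the task was performed successfully


def qos_summary(records, deadline):
    """Compute a few QoS metrics following the definitions listed above."""
    last_finish = max(r.finish for r in records)
    first_start = min(r.start for r in records)
    return {
        "makespan": last_finish - first_start,
        "mean_waiting_time": sum(r.start - r.ready for r in records) / len(records),
        "tardiness": max(0.0, last_finish - deadline),
        "reliability": sum(r.succeeded for r in records) / len(records),
    }


# Hypothetical three-task execution trace.
records = [
    TaskRecord(ready=0.0, start=0.0, finish=4.0),
    TaskRecord(ready=4.0, start=5.0, finish=9.0),
    TaskRecord(ready=4.0, start=6.0, finish=12.0, succeeded=False),
]
summary = qos_summary(records, deadline=10.0)
print(summary)
```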
Meeting these requirements is a primary challenge in workflow scheduling. To address this issue, different workflow scheduling algorithms have been proposed. In the next section, we review research works on PSO-based workflow scheduling schemes that consider QoS constraints.

Standard PSO
VM migration in the cloud is costly and time-consuming. Instead of moving an entire overloaded VM, the PSO-based task scheduling algorithm in [38] shifts tasks from overloaded VMs to underloaded VMs. A new optimization model was developed to assign these tasks to VMs while optimizing makespan as well as transfer time. The architecture of cloud resource brokers was designed, developed and built by [39]. Reference [59] states that controlling the resources and handling the different kinds of QoS parameters through a specified fitness function is a constraint when using the PSO algorithm. A PSO-based asset planning technique called BULLET was suggested by [60] to optimize running cost, time, availability and power along with other QoS parameters. This PSO-based algorithm has been used to plan resources efficiently and to solve the problem optimally using the fitness function. The fitness function is more effective [61,62] in allocating the best resources for applications (tasks). It enables all applications to be processed in the shortest possible time at a minimum cost.

Jumping and Learning PSO
In the standard PSO technique, the global best particle, $g_{best}$, can get stuck in local minima because it is not dynamically adjusted in every iteration, which yields a poor convergence rate. To address this drawback, the Jumping PSO technique was proposed. This technique involves moving particles from one coordinate to another and reducing workflow scheduling compilation time [42]. Self-adaptive learning PSO [41] incorporates four velocity-updating mechanisms for the IaaS cloud to assign user tasks effectively and increase the revenue of cloud service providers.

Bi-Objective PSO
Bi-objective PSO is a variation of PSO that simultaneously optimizes two objectives in the cloud environment. This approach takes deadline and budget constraints into account to optimize the cost and time of execution. A priority bi-objective PSO algorithm was proposed by [43] to simultaneously optimize both cost and makespan. The proposed algorithm uses the result of HEFT to initialize PSO. Its efficiency was evaluated through simulations of four different real-world workflows and comparisons with other algorithms. The results indicate that the proposed algorithm performs better than the others. To simultaneously optimize both parameters, i.e., time and cost, the authors in [44] suggested the use of the smallest position value rule (with the PSO technique) to meet the end-user requirements and reduce infrastructure cost (thereby maximizing profit for cloud service providers).
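The smallest position value (SPV) rule mentioned above is commonly described as a way of turning a continuous particle position into a discrete sequence by ranking its dimensions. The sketch below shows only that ranking step under this assumption; the function name and the sample position are hypothetical, and how [44] maps the resulting sequence onto cloud resources is not specified here.

```python
def spv_permutation(position):
    """Rank dimensions by ascending position value (the smallest value comes first)."""
    return sorted(range(len(position)), key=lambda d: position[d])


# Hypothetical continuous position of one particle with six dimensions (one per task).
position = [1.8, -0.99, 3.01, -0.72, -1.20, 2.15]
sequence = spv_permutation(position)
print(sequence)  # dimension indices ordered from smallest to largest value
```

Because the ranking is recomputed after every velocity update, the standard continuous PSO equations can be kept unchanged while the fitness is still evaluated on a discrete sequence.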

Modified PSO (MPSO)
In the last decade, several modifications to PSO algorithms (known as MPSOs) have been proposed to optimize the performance of cloud computing for different QoS parameters. Many MPSOs now exist to overcome the weaknesses of the existing PSO algorithm. For example, [45] introduced an improved PSO planning algorithm to solve the cloud resource planning problem. The algorithm looks for the best resource for the next task and assigns the task to that resource to minimize completion time and cost. This is done based on the current workload of the VMs. Results showed that MPSO algorithms are better than the existing PSO algorithm in terms of time, cost, speed, and effectiveness. Similarly, [47] introduced an MPSO algorithm that optimized the fitness function to reduce the processing time and utilization of cloud resources. A new PSO methodology for the IaaS cloud, called NPSO, was introduced in [48] to minimize the financial cost and the time taken to complete applications. An updated PSO algorithm was also proposed by [61] to address particle decline randomness and find an optimal global solution. The proposed technique provides one-to-one mapping and the fastest processor assignment tasking.
The MPSO techniques mentioned above suffer from premature convergence and stagnation. Thus, efforts were made to resolve these problems. In this context, [50] proposed distribution-dependent update rules for the PSO algorithm and evaluated them on 13 non-linear global optimization benchmark functions. Experimental evidence shows that the proposed PSO-based algorithm optimizes the fitness function better than the existing algorithms. To increase the global search efficiency, a modified algorithm called APSO-VI was proposed in [62] based on the average absolute velocity of the evading particles. The experimental findings showed that the proposed algorithm escaped from local minima without premature convergence. Reference [52] used the APSO-VI algorithm to schedule applications in a cloud environment. The proposed algorithm optimized different QoS parameters (such as cost, time, throughput, energy consumption and task rejections) when compared to other state-of-the-art algorithms (such as PSO) under a time limit constraint.

Binary PSO (BPSO)
Many real-world optimization problems are discrete. Examples include task scheduling, the 0-1 knapsack problem and the traveling salesman problem. These problems can be solved using the BPSO algorithm. This binary version of PSO [16] was proposed for discrete optimization problems in 1997. The sigmoid transfer function is used to convert the velocity value from continuous to binary. BPSO has been used to solve diverse discrete optimization problems [53][54][55]. It has good convergence ability but suffers from premature convergence caused by a lack of diversity. An active research focus is to enhance BPSO's exploration and exploitation capability. In this context, a sigmoid transfer function, a linear transfer function and two separate V-shaped transfer functions have been proposed to address the exploration-exploitation problem in BPSO [16,56,57].
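The role of the sigmoid transfer function can be sketched as follows: each velocity component is mapped to a probability, and the corresponding bit is set to 1 with that probability. This is a minimal illustration of the idea rather than the exact procedure of any cited work; the function names and the sample velocities are assumptions.

```python
import math
import random


def sigmoid(v):
    """Map a real-valued velocity to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def binary_position(velocity, rng):
    """Set each bit to 1 with probability sigmoid(v) for its velocity component v."""
    return [1 if rng.random() < sigmoid(v) else 0 for v in velocity]


rng = random.Random(42)
bits = binary_position([-6.0, 0.0, 6.0], rng)
print(bits)  # strongly negative velocities favour 0, strongly positive favour 1
```

In practice velocities are usually clamped to a bounded range before the transfer function is applied, which also avoids numerical overflow in the exponential for extreme values.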

Hybrid PSO
PSO can be combined with one or more other scheduling algorithms to solve several practical problems. Such combinations are called hybrid PSO algorithms. They can address problems such as local minima, premature convergence and low convergence accuracy. For example, to improve the completion time and resource usage ratio in cloud computing, the resource planning algorithm in [58] combines SA with PSO to form an improved PSO (IPSO) algorithm. SA increases the convergence speed and accuracy. This is achieved by combining it with PSO in every iteration throughout the simulation process. In IPSO algorithms, SA also increases PSO search speed. Krishnasamy and Gomathi [63] discussed an additional hybrid PSO algorithm (PSO and DE) to balance the workload and minimize the computing time of applications in the cloud.
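How exactly SA is embedded in the IPSO algorithm of [58] is not detailed here. One common way to hybridize the two, shown below purely as an assumed sketch, is to let a particle accept a candidate position with the Metropolis criterion from simulated annealing, so that a worse solution is occasionally accepted while the temperature is still high. The function name `sa_accept` and the sample values are hypothetical.

```python
import math
import random


def sa_accept(current_cost, candidate_cost, temperature, rng):
    """Metropolis criterion: always accept improvements, sometimes accept worse moves."""
    if candidate_cost <= current_cost:
        return True
    # The probability of accepting a worse move shrinks as the temperature drops.
    return rng.random() < math.exp(-(candidate_cost - current_cost) / temperature)


rng = random.Random(1)
print(sa_accept(10.0, 9.0, temperature=1.0, rng=rng))     # improvement, always accepted
print(sa_accept(10.0, 1000.0, temperature=1e-3, rng=rng))  # much worse at low temperature
```

The temperature would normally be lowered by a cooling schedule across PSO iterations, so the swarm explores more freely early on and behaves greedily near the end.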

PSO-Based Workflow Scheduling Schemes
The particle swarm optimization algorithms for workflow scheduling can be generally categorized as heuristic, meta-heuristic and hybrid schemes (Figure 5). With regard to RQ1, it is observed that most researchers concentrate on the following techniques for planning workflows in distributed environments.

Heuristic Algorithms
Heuristic means "to be found by trial and error." This group of algorithms solves optimization problems in a reasonable time. However, optimal solutions cannot be guaranteed. This is acceptable when we do not necessarily need the best solution but rather a good solution that can easily be found [64]. These algorithms were used by previous researchers to solve scheduling problems in cloud computing. For example, the approach in [65] incorporates service cost into PSO (PSO-SC) to optimize workflows in a dynamic cloud scenario. The PSO-SC approach not only reduced the computing time but also decreased the cost of running users' tasks during the scheduling process. Results show that the approach effectively schedules tasks and reduces the complexities associated with such scheduling processes. However, it can be trapped in local optima. Tables 3 and 4 present the current PSO-based heuristic algorithms used by researchers to address workflow scheduling problems. The tables provide comprehensive answers to RQ1, RQ2, RQ3 and RQ4. They indicate the source of the algorithm, its advantages and disadvantages, the testing tool used in the experiment, target QoS constraints, etc.

Meta-Heuristic Algorithms
The word "meta" means "above" and usually, the meta-algorithms do much better than simple heuristics. This is because they involve randomization and local searches. Randomization provides a good way to escape local searches and thus all meta-heuristic algorithms are built for global optimization [64]. Next, we review some PSO-based meta-heuristic algorithms.
Pandey et al. [66] found PSO to be the most effective for run-time workflow scheduling. This is due to its low computation and communication cost. Also, there are two considerations for obtaining optimized solutions: one is the heuristic scheduling process while the other is the PSO for optimized performance "task-resource mapping". In [67], Wu et al. proposed an RDPSO-based PSO algorithm where each solution is described in task-set pairs. Greedy's Randomized Adaptive Research Process (GRASP) is used to maximize the initial swarm population. A three-stage process is then followed to establish new swarms. The "gbest" and "pbest" particles are picked at the first level. However, due to the discrete properties of scheduling, gbest pairs are not well-optimized in the next step for producing new locations as they 'learn' from their previous location. The unmapped tasks pick resources from other optimized pairs in the third step. The authors concluded that RDPSO surpasses PSO with respect to minimizing costs.
Thanh et al. [68] proposed a new version of the PSO algorithm, PSOi, that was shown to solve the workflow scheduling problem. PSOi deploys other approaches to reach optimal solutions without being trapped in local optima. This version uses a new strategy, called "inverse", for transferring particles into a new space. After every iteration, PSOi helps to change the particle position. A Catfish PSO (C-PSO) algorithm [19] was proposed to select the best task schedule with the least execution time and makespan. It was developed to schedule large scientific workflows in an IaaS cloud. As hypothesized, the algorithm was able to efficiently schedule tasks and map them to their appropriate resources. Another proposed workflow scheduling algorithm was named PSO-DS with CUPA features [18]. This algorithm uses a workflow manager system (WMS) to create a direct link between the workflow owners and resources. Thus, the WMS was used with the required protocol to start communication between resources in the experiment.
Many of the proposed scheduling algorithms for cloud computing fall short of meeting the QoS required by users or do not take other basic principles of cloud computing, such as heterogeneity of resources, into consideration. Reference [69] proposed a resource scheduling strategy for scientific workflows in the IaaS cloud. The approach uses a PSO meta-heuristic algorithm to minimize the total execution cost of a workflow while meeting deadline constraints. The main objective was to optimize workflow scheduling in the cloud, considering the dynamism in IaaS resource provisioning and scheduling. This approach uses PSO not just for mapping tasks to resources but also for determining the number and type of virtual machines (VMs) to be leased and when they should be leased and released. Moreover, it considers diverse IaaS cloud characteristics such as performance variation and resource boot time. The proposed solution merges different underlying aspects of the IaaS cloud such as elasticity, heterogeneity, pay-by-use, and resource dynamism. The drawback of this approach is that it defines deadline constraints for resource provisioning and execution cost minimization.
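A deadline-constrained cost objective of this kind can be sketched as follows. This is an illustrative model, not the formulation of [69]: it assumes independent tasks run back to back on each VM, that a VM is leased from time zero until its last task finishes, that billing is per whole period, and that all times are in seconds. The function name `evaluate` and the parameters `boot_time`, `billing_period` and `vm_price` are hypothetical.

```python
import math

def evaluate(mapping, task_len, vm_speed, vm_price, deadline,
             boot_time=60.0, billing_period=3600.0):
    """Return (deadline_overrun, cost) for a task-to-VM mapping.

    Tuples compare lexicographically, so any schedule that meets the
    deadline ranks ahead of one that misses it; ties break on cost.
    """
    busy = {}
    for task, vm in enumerate(mapping):
        busy[vm] = busy.get(vm, 0.0) + task_len[task] / vm_speed[vm]
    # Only VMs that received at least one task are leased (and billed).
    finish = {vm: boot_time + t for vm, t in busy.items()}
    cost = sum(math.ceil(finish[vm] / billing_period) * vm_price[vm]
               for vm in finish)
    makespan = max(finish.values(), default=0.0)
    return (max(0.0, makespan - deadline), cost)
```

Used as the fitness of a particle, the tuple lets the swarm first drive the deadline overrun to zero and then minimize the leasing cost among feasible schedules, which is one common way of handling a hard constraint without tuning a penalty weight.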
Similarly, a multi-swarm multi-objective advanced operation algorithm (MSMOOA) [70] was proposed to improve multi-objective workflow scheduling in cloud computing. The approach uses different kinds of swarms to cater to diverse issues, thereby enabling efficient data sharing among the swarms. Each physical machine works with these swarms. The swarms are later "upgraded" to a multi-objective particle used for discovering a "non-dominated" solution as a single objective. MSMOOA groups the particles of a swarm into two classes. Particles in the first class communicate with particles of different swarms to promote data sharing among the swarms. Particles in the second class exchange data with particles of the same swarm. In the approach, a server manager (SM) is used to maintain resource visibility for the mapping. Experimental results comparing MSMOOA with two other approaches (HEFT and MOHEFT) indicated that MSMOOA performed significantly better than both.
The authors in [71] developed a fuzzy PSO algorithm based on the LJFN and SJFN heuristics. Their method alternates between the LJFN and SJFN heuristics each time a new job is assigned to grid nodes. Nevertheless, the number of grid nodes allocated based on FCFS and LJFN is greater than the number of tasks. The proposed approach creates optimal solutions in time to reduce preparation time and increase resource-use efficiency. The authors of [41], [72] suggested a PSO-based strategy that takes advantage of PSO's quick convergence. The proposed method (SLPSO) deploys various velocity update methods that prevent the solution from being stuck in local optima and boost performance. It also involves other policies such as adaptive parameter changes, different population topologies, multiple populations in standard PSO, and bio-inspired PSO methods that combine PSO with other adaptive systems. Within the user-specified QoS constraints, SLPSO performs efficiently.
PSO with VNS was proposed in [73]. It combines four procedures: "initialization", "particle string update", "fitness calculation" and "solution selection". Before these procedures, the "particle string" must be generated to encode promising solutions. VNS was implemented to increase the reliability of the solutions in the solution selection process. On the same subject, Chen et al. [74] studied S-CLPSO to handle user-specified constraints using PSO and explained the set-based PSO (S-PSO) approach and its suitability for workflow scheduling. In the S-CLPSO algorithm, velocity and position are updated in every iteration. The S-PSO strategy tends to be a better option for workflow scheduling problems since service instances in the cloud are treated as a set. Also, it is simple to 'expedite search' with S-PSO. It was therefore concluded that, under very tight constraints, the results obtained by S-CLPSO are very promising. Tables 5 and 6 provide a comparative analysis of the various meta-heuristic algorithms used in the literature to manage workflow scheduling problems. The tables present a systematic year-by-year review of several meta-heuristic algorithms, indicating the source of each algorithm, its advantages and disadvantages, the testing tool used in the experiment, target QoS constraints, scheduling strategy, etc. Both tables answer RQs 1-4.

Hybrid Algorithms
This section reviews research works that combine heuristic and meta-heuristic algorithms. A systematic analysis of the various hybrid algorithms proposed for workflow scheduling is shown in Tables 7 and 8 based on the year of publication. These tables also provide answers to RQs 1-4, indicating the strengths and weaknesses of the algorithms. The testing tools used in the experiments, target QoS constraints, scheduling strategy, etc. are also highlighted. One of the tabulated algorithms uses the concept of the novel adaptive elite-based PSO (NAEB-PSO) for task-resource mapping and runs the workflow execution process to minimize total cost and makespan.
The ability of the PSO approach to explore the problem space has been improved by using a random inertia weight, which gives particles the ability to find better solutions during the late stages of the search. In comparison to non-heuristic implementations, the experimental results indicated a 30% decrease in cost relative to PSO; a 30% cost reduction relative to the gravitational search algorithm was also recorded.
The authors of [79] introduced the PSO-ACO algorithm, which is a fusion of the PSO and ACO algorithms. The algorithm focuses on reducing the cost and time expended in PSO for the "fitness test" and seeks the globally optimal solutions in ACO. The approach first initializes the population and then compares the particles using the objective function in an iterative loop. The steps are repeated, updating the velocity and position of the particles, until a full schedule is obtained. Moreover, ACO also uses the global updating process and job rescheduling. The authors of [80] proposed a new PSO and TS algorithm in which PSO executes the global search and TS performs a local search. The idea behind this hybrid approach is to develop both local (in a confined space) and global solutions. It finds viable solutions while preventing them from becoming stuck in local optima.
A non-dominance sort-based hybrid particle swarm optimization (HPSO) algorithm was introduced in [4] to solve scheduling problems with conflicting objectives in the IaaS cloud. HPSO extends the authors' previously proposed Budget and Deadline Constrained Heterogeneous Earliest Finish Time (BDHEFT) algorithm by combining it with a form of multi-objective PSO. HPSO aims to optimize two conflicting objectives, makespan and cost, under deadline and budget constraints. It provides a collection of 'best solutions' from which a customer can choose. Its operations consider non-dominated solutions to tackle workflow scheduling problems in the cloud, and it involves a mixture of a multi-objective PSO operation and list-based heuristics [4]. One disadvantage of the approach is that energy consumption was not considered during the scheduling phase. In future works, the energy consumption of the generated workflow schedule could also be reduced while considering these two conflicting objectives.
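The notion of a non-dominated set of schedules can be illustrated with a short sketch. It shows only the generic Pareto filter for two minimized objectives and is not the sorting procedure of [4]; the function names and the sample (makespan, cost) values are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Return the non-dominated subset of (makespan, cost) points."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Candidate schedules as (makespan, cost); both objectives are minimized.
schedules = [(10, 5.0), (8, 7.0), (12, 4.0), (11, 6.0), (8, 9.0)]
front = pareto_front(schedules)
```

Here `(11, 6.0)` is dominated by `(10, 5.0)` and `(8, 9.0)` by `(8, 7.0)`, so the front keeps the remaining three schedules. None of them is better than the others in both makespan and cost, which is why the user is offered the whole set to choose from.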
To improve the elasticity and heterogeneity of the existing works in the cloud for optimal scientific workflow scheduling in the IaaS cloud, the authors of [81] introduced a new meta-heuristic optimization strategy involving the ACO and PSO algorithms. The hybrid finds local and global best solutions to minimize makespan and reduce cost. Another hybrid workflow scheduling algorithm was proposed in [82]. The approach combines the features of both PSO and SA. It was implemented on CloudSim to improve the brokers' services, reduce makespan and increase resource utilization.
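How SA is coupled with PSO in [82] is not detailed here. One common way to hybridize the two, shown below purely as an assumption, is to apply the standard SA (Metropolis) acceptance rule to candidate moves so that a worse schedule is occasionally accepted and the search can leave a local optimum. The function name `sa_accept` and its parameters are hypothetical.

```python
import math
import random

def sa_accept(current_cost, candidate_cost, temperature, rng=random):
    """Metropolis rule: always accept improvements; accept a worse
    candidate with probability exp(-delta / temperature)."""
    delta = candidate_cost - current_cost
    if delta <= 0:
        return True
    if temperature <= 0:
        return False  # fully cooled: behave greedily
    return rng.random() < math.exp(-delta / temperature)
```

At a high temperature almost any move is accepted, which favors exploration; as the temperature is lowered the rule approaches a greedy search, which favors exploitation of the best schedules found so far.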
Mapping tasks to the available resources in the cloud is quite challenging. Thus, a hybrid meta-heuristic algorithm for optimizing parallel scheduling processes in the cloud environment [83] was proposed. This improves on the hybridized SA and PSO algorithms using a crossover variation operator. The algorithm was able to effectively reduce the flow time and schedule length. Results from the experiment indicated that the approach outperforms some of the existing approaches. A PSO-based constraints-aware multi-QoS workflow scheduling strategy and a proposed look-ahead heuristics (LAPSO) [22] were used to provide QoS satisfaction for various end-users (EU) with diverse QoS objectives and optimization requirements. The strategy selects the best solution using the proposed constraints handling approach. It hybridizes PSO with a novel look-ahead mechanism based on min-max heuristics which improves the quality of choosing the best solution. Simulation results indicate that LAPSO guarantees the satisfaction of the EU constraints even in "tight" situations.
Another major issue in cloud industries is allocating and scheduling dynamic and virtual resources to the users for maximal profit. A multi-objective resource allocation (GA-PSO) algorithm was proposed in [84] to minimize cost, time, and energy consumption. The approach uses meta-heuristic algorithms to solve some scheduling issues encountered in cloud industries. PSO solves large optimization problems with superior search speed and GA solves both non-linear and highly complicated engineering problems. Experimental results indicated that GA-PSO was able to reduce cost and makespan and also increase resource utilization.
Conflicting constraints such as budget and deadline emerge in the course of scheduling cloud resources because cheaper resources are slower than expensive ones. Most of the previous studies concentrate on one objective, i.e., either time minimization or cost minimization, under user-specified QoS constraints. Because of the complexity of workflows and the dynamic nature of the cloud, a trade-off is required to balance total execution time and processing cost. Another effort, RHDPSO [26], showed that premature convergence and entrapment in local optima should be prevented. To this end, two methods were presented: first, "the discretization process", which is used to overcome the multi-QoS workflow constraint scheduling problem, and second, "the random time sequence method", which can interrupt double particle extremities and solve premature convergence and local optimum problems. However, the regular PSO algorithms are superior to this hybrid form.

Summary of the Literature Review
This section summarizes the QoS metrics used in the reviewed literature for the evaluation of PSO-based workflow scheduling strategies. The rate at which each QoS metric is utilized for evaluation purposes is presented in Figure 6. The limitations of this review are also highlighted.

Percentages of QoS Metrics Used in Workflow Scheduling Strategies
As Figure 6 shows, cost is the evaluation measure most widely used in the reviewed literature to assess PSO-based workflow scheduling strategies: 30% of the proposed strategies considered execution cost. Execution time has the second-highest share (17%), followed by makespan (15%) and meeting user-defined deadline constraints (9%). Resource utilization and reliability account for 6% and 5%, respectively, and energy consumption was used for evaluation in 4% of the workflow scheduling research encountered in this review. Efficiency, security, and reputation (1% each) were scarcely used.

Limitations of This Literature Review
Upon analyzing the data obtained from the literature review related to workflow scheduling, we identified the following limitations:

• The best criteria or methods for different databases were not defined.

• The accuracy of the algorithms has not been established.

• Not all the QoS constraints, e.g., load balancing, were addressed.

Historical Distribution
This paper presents the distribution of published research on workflow scheduling in the last few years. Specific RQs are considered for resolving the gaps in current strategies (RQ3). From Sections 1-6, we evaluated potential expectations (RQ5) after conducting QA, SDS, and DCP on the corresponding publications from 2000 to 2019.

Distribution of Publications per Year
The papers published in the years between 2000 and 2019 are shown in Figure 7. This analysis includes all articles retrieved from all databases before the exclusion process.
The measurements highlighted in most of the research articles reviewed within the scope of this paper (i.e., scheduling scientific workflows using PSO-based techniques) are illustrated in Figure 8. Researchers in this area considered 15 common metrics. Figure 8 also provides the frequency count and the number of papers that used each metric, shows the most and least considered measurements, and classifies the QoS metrics studied in each paper. Execution time, makespan and cost have the highest frequency count, whereas fault tolerance, throughput, response time, reputation, efficiency and security are much less utilized in PSO-based scientific workflow scheduling. Therefore, we note that execution time, makespan and cost are the main concerns for most authors.

Research Validity
This research has carefully examined the existing literature. Nonetheless, some primary studies might not have been reviewed considering that researchers use various synonyms in the course of presenting their work. Moreover, we have thoroughly reviewed the techniques during the DCP stage to prevent the biased study selection problem.

Technical Comparison of Cloud, Fog and Edge Computing
The fog computing architecture consists of fog clusters wherein multiple fog devices cooperate. In contrast, the most important physical components of clouds are data centers. As a result, cloud computing is expensive to operate and consumes considerable energy, whereas energy consumption and operating costs are low in the fog computing paradigm. Fog is closer to the user, so there may be only one or a few hops between users and fog devices [89], while the distance between users and the cloud is significantly greater. For this reason, communication latency is higher for the cloud than for fog. The cloud is more centralized, while fog is regionally coordinated and offers a dispersed solution [90].
In edge computing, different platforms (each with a different runtime) can be used for programming, whereas cloud computing typically uses one programming language for one target platform. Edge computing requires a comprehensive security plan covering advanced authentication and proactive handling of attacks, while the cloud does not require a massive security plan. Edge computing is known to be suitable for operations with very strict latency requirements. Hence, medium-sized businesses with budget constraints can save financial resources using edge computing. On the other hand, cloud computing is better suited for large data storage programs and organizations [91]. The technical comparison between these three types of computing is shown in Table 9.

Open Challenges and Future Research Direction
This section answers RQ5 by providing insights into further research directions. We recommend that future works should explore different choices that affect the performance of the scheduling algorithm in terms of: (1) selecting the initial resource pool which has a significant effect on the process, (2) using different optimization algorithms such as genetic algorithm and (3) comparing their performances with PSO. There is also a need to work with these algorithms to ensure the process of mapping tasks to resources takes place with enough memory size. Moreover, these algorithms should be implemented in workflow engines so that they can be used in different real-life applications. Introducing hybrid meta-heuristic algorithms can also improve the performance in the cloud.
Similarly, it is necessary that multi-objective solutions are introduced to workflow scheduling processes. Due to the vulnerability of the cloud environment to failure, fault-tolerant approaches are needed to ensure effective communication between users. Also, error-tolerant methods with lower complexity must be taken into account for designing multi-objective workflows. To reduce the energy consumption of cloud data centers and achieve green-cloud computing, more attention must be paid to objectives like the VM load balancing in the data center network. Furthermore, some recent cloud schedulers report security threats that negatively impact the quality of cloud services. This is generally because cloud systems overlook security problems. To either prevent or minimize the impacts of these security threats, future research addressing scheduling problems should focus on various factors relating to protection and workflow scheduling solutions.

Conclusions
Cloud computing is a new technology that enables the industry to benefit from virtual resources on a pay-per-use basis. Its scheduling process involves mapping tasks to VMs to reduce the makespan and execution cost. Scheduling also enhances resource availability and system scalability for cloud providers, thereby reducing the operational cost of data centers. Particle swarm optimization is a popular "unorganized" optimization technique whose low computational cost and cost-effectiveness make it suitable for workflow scheduling in cloud computing. In this respect, this paper presented a clear analysis of different PSO-based algorithms in cloud computing, in line with the objective of solving workflow scheduling problems. We note that future work should focus on scheduling workflows in a heterogeneous cloud environment. Also, the dynamic request for hybrid resources should be evaluated while considering different levels of reliability. Furthermore, scheduling algorithms should cater to the trust concerns of cloud users who submit tasks for execution in the cloud.