1. Introduction
In 1997, EUROCONTROL proposed the System Wide Information Management (SWIM) concept to the Federal Aviation Administration [1]. The International Civil Aviation Organization accepted this concept in 2002 [2]. The System Wide Information Management is a large-scale network system with high integration, which integrates and manages internal resources. SWIM, as an information sharing platform, provides a unified data transmission and exchange mechanism for different subsystems of civil aviation businesses. In order to achieve interoperability and consistency among the relevant units of civil aviation, as well as reduce the difficulty of integrated dispatching, this system enables data to be processed and integrated among independent systems. The basic idea of SWIM is to allow all air traffic participants, such as airports, air traffic control, airlines and other related civil aviation units, to share and exchange the latest information. Moreover, it ensures that information can be shared safely, effectively and in a timely manner. Thus, for accurate and timely cooperative decision-making, SWIM can greatly improve the predictability and effectiveness of decisions and meet the needs of civil aviation in terms of efficient and coordinated operation. The comparison of the traditional civil aviation mesh communication network and the SWIM bus communication structure is shown in Figure 1 [3].
In order to improve SWIM resource utilization and sharing rate, it is necessary to solve the problem of system task scheduling. Because of the distribution, heterogeneity and autonomy of resources in a SWIM environment, SWIM task scheduling is more complex and difficult. As an algorithm for solving combinatorial optimization problems, the ant colony algorithm has the characteristics of parallelism and strong robustness. It can quickly obtain high-quality solutions and is very suitable for solving SWIM task scheduling problems.
Although existing improved ant colony algorithms can improve the efficiency of task scheduling, they do not significantly reduce system load imbalance. Building on the classical ant colony algorithm and taking SWIM's own characteristics into account, this paper reviews existing network task scheduling techniques and focuses on the unbalanced resource load of service nodes during task scheduling. We propose the SWIM ant colony task scheduling algorithm based on load balancing (ACTS-LB), which updates pheromones according to the hardware performance and load of each service node. The ACTS-LB algorithm can reduce task execution time and improve system load balancing, so that users' task scheduling needs are met as far as possible.
The remainder of the paper is structured as follows. We review related work in Section 2. We introduce SWIM load balancing requirements and related definitions in Section 3. Section 4 presents the details of the ACTS-LB algorithm. We give the simulation results and analysis in Section 5 and conclude this paper in Section 6.
2. Related Work
At present, experts and scholars both at home and abroad have conducted much research on task scheduling in large-scale network systems, such as cloud computing, grid and other network systems. Among them, the min-min algorithm, max-min algorithm and other task scheduling algorithms are the early classic methods. There are also exact algorithms to solve task scheduling problems, such as branch-and-bound algorithms, linear programming and cross-entropy methods. Heuristic algorithms have also been used to solve such optimization problems, including the genetic algorithm (GA), simulated annealing (SA), the ant colony (AC) algorithm, particle swarm optimization (PSO) and the greedy algorithm. These algorithms have different characteristics, but all are based on the idea of minimizing scheduling objectives. They have been applied in the task scheduling of these network systems and have become reference points for subsequent research on scheduling algorithms.
In [4], the authors proposed a new approach for solving the shift minimization personnel task scheduling problem. Properties of the problem are used to develop a new branch and bound scheme, which is used in conjunction with two column generation based approaches and a heuristic algorithm to create an efficient solution procedure. In [5], the authors presented a novel distributed implementation of multiple hypothesis tracking (MHT), based on a hash-tree distributed content storing approach that enables fast operations on local trees and also allows sharing of hypotheses between local and remote nodes. In [6], the authors proposed a general algorithm for fast estimation of the error probability of linear block codes on BSC channels, based on importance sampling and the cross-entropy method for rare events, that can be employed for any hard-decision decoder. When optimal decoding is used, the algorithm reduces to a single simulation run that can estimate, with a given accuracy, the performance for a whole range of sufficiently high signal-to-noise ratios. In [7], a multi-objective task scheduling algorithm was proposed based on the fusion of a genetic algorithm and a particle swarm algorithm to improve the global search ability and convergence speed. In [8], the authors introduced an optimized task scheduling algorithm for cloud computing based on a genetic simulated annealing algorithm, together with its implementation. The algorithm considers the QoS requirements of different types of tasks, and the QoS parameters are made dimensionless. In addition, a hybrid algorithm based on ant colony optimization (ACO) and cuckoo search was used to reduce task execution time, as proposed by RG Babukarthik [9]. In [10], the authors proposed a new heuristic algorithm combined with the particle swarm optimization (PSO) algorithm. It has the characteristics of strong optimization search ability, fast convergence speed and high solving quality, providing a new direction for solving task scheduling problems in a cloud computing environment. In [11], the authors comprehensively considered the characteristics of tasks and virtual machine resources in the cloud environment and proposed a task scheduling strategy that improves the greedy algorithm to raise the overall scheduling efficiency of the system in the cloud computing environment. These algorithms perform well in task scheduling, but they seldom address system load balancing.
In research on task scheduling algorithms aimed at load balancing, Arul Xavier et al. proposed a load balancing task aware scheduling algorithm for cloud computing, targeting task scheduling problems across heterogeneous virtual machines [12]. In [13], by analyzing and comparing some cluster load balancing algorithms, a task load balancing scheduling algorithm based on ant colony optimization (WLB-ACO) was proposed. The task completion efficiency of this algorithm is good, but the task scheduling quality is difficult to guarantee. In [14], the authors presented a particle swarm optimization with the random forest classifier algorithm, which is used to solve the load balancing problem of virtual machines. In order to balance the utilization of virtual machine resources, the total task working time on a virtual machine is taken as the optimization objective. However, this method relies too much on intermediate nodes. In [15], a cloud computing resource scheduling method based on a parallel genetic algorithm was proposed. This method can reduce the overall execution time of scheduling tasks to a certain extent, but it easily falls into local optima. In [16], the authors introduced the concept of virtual machine relative fitness according to the performance of virtual machine resources in a cloud environment, which makes it possible for virtual machine resources with a high relative fitness to obtain greater variation and can thus speed up the convergence of the algorithm. The comparison of the related work references is shown in Table 1.
On the basis of the existing research results, this paper analyzes and refers to a large number of network task scheduling algorithms in distributed and heterogeneous environments. Combined with the characteristics of SWIM, the adopted ant colony algorithm is improved. We propose a SWIM ant colony task scheduling algorithm based on load balancing (ACTS-LB). The simulation experiment results show that the performance of the ACTS-LB algorithm is better than that of the traditional min-min algorithm, ACO algorithm and PSO algorithm. The specific contributions of this study are as follows:
(1) The ACTS-LB algorithm reduces the task scheduling completion time and improves SWIM resource utilization;
(2) It can ensure that SWIM has better load balancing performance;
(3) It is of great significance to promote the SWIM application for civil aviation industry development.
3. SWIM Load Balancing Requirements and Related Definitions
3.1. SWIM Load Balancing Requirements
SWIM is a large-scale network system with high integration, which integrates and manages internal resources. In order to improve the utilization rate of system information and the sharing rate of resources, it is necessary to solve the problem of system task scheduling load balancing and find suitable nodes to deal with user needs. This makes SWIM load balancing and task scheduling more complicated and difficult. The selection of a resource node with a smaller load to complete user requirements is a very challenging task.
The SWIM infrastructure has a huge amount of computing and storage resources. Existing and future civil aviation applications require these resources to have the ability to perform large-scale, real-time interactions. This can result in a huge load on the data center, and it is easy to produce load imbalance. Different from load balancing in small database systems, SWIM, as a large-scale data center network, needs to respond to a high throughput of concurrent requests. Resource task scheduling and allocation become key, which puts forward higher requirements for load balancing methods [17].
(1) Large-scale information network systems have a huge amount of data resources, so a traditional load balancer cannot meet the application requirements. It is necessary to design a new load balancing method.
(2) Although a complex load balancing algorithm can achieve a good balancing effect, it occupies too much of the system’s own resources. Especially when there are many system resource nodes, there is a risk that the algorithm enters an endless loop or stalls. Therefore, it is also necessary to design a lightweight load balancing algorithm that achieves a good load balancing effect with low algorithmic complexity.
(3) The load balancing method must work under conditions of a large amount of data, in the form of concurrent requests, as well as satisfy the requirement of reasonable load allocation in the case of resource competition.
3.2. Relevant Definitions
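(1) System total resource set $R = \{r_1, r_2, \ldots, r_m\}$, where the system consists of m resource nodes and $r_j$ ($1 \le j \le m$) is the j-th system resource node;
(2) System total scheduling task set $T = \{t_1, t_2, \ldots, t_n\}$, including n scheduled tasks, where $t_i$ ($1 \le i \le n$) is the i-th system scheduling task;
(3) Expected time to compute (ETC), $ETC = \{etc_{ij}\}$, where $etc_{ij}$ is the expected completion time of scheduling task $t_i$ on resource node $r_j$;
(4) $tabu_k$ is the search table of ant $k$.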
The mathematical symbols used in the formulas of this paper are listed in Table 2.
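To make these definitions concrete, the following minimal sketch represents the task set, the resource set, the ETC matrix and one ant's search table as plain Python structures. The task lengths, node speeds and the rule "expected time = length / speed" are illustrative assumptions, not values or formulas taken from this paper; `build_etc` is a hypothetical helper name.

```python
# Illustrative only: task lengths and node speeds are made-up numbers.
task_lengths = [400.0, 250.0, 900.0]   # n = 3 tasks (hypothetical work units)
node_speeds = [100.0, 50.0]            # m = 2 resource nodes (hypothetical units/s)

def build_etc(lengths, speeds):
    """Return an n x m matrix; etc[i][j] is the expected time of task i on node j."""
    return [[length / speed for speed in speeds] for length in lengths]

etc = build_etc(task_lengths, node_speeds)

# Search table of one ant: indices of tasks that have already been scheduled.
tabu = set()
tabu.add(0)
remaining = [i for i in range(len(task_lengths)) if i not in tabu]
```

Here `remaining` plays the role of the tasks an ant may still choose once task 0 has entered its search table.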
4. Ant Colony Optimization Algorithm Based on Load Balancing
4.1. Ant Colony Optimization Algorithm Analysis
Marco Dorigo first proposed an ant colony optimization algorithm in 1991 to achieve global optimization by simulating ant foraging behavior [18]. Suppose that $p_{ij}^{k}(t)$ is the transition probability of ant $k$ from resource node $i$ to resource node $j$ at time $t$; this value is used to select the next task to schedule. It is expressed by Equation (1).
In Equation (1), $\tau_{ij}(t)$ is the pheromone value for transferring from resource node $i$ to resource node $j$ at time $t$; $\eta_{ij}(t)$ is the heuristic function for transferring from resource node $i$ to resource node $j$ at time $t$; $\alpha$ is the information heuristic factor, which represents the relative importance of scheduling order; $\beta$ is the expectation heuristic factor, which represents the importance of heuristic information when an ant selects the scheduling sequence. The larger the value of $\beta$, the more likely the ants are to choose the scheduling sequence autonomously. $allowed_k$ denotes the set of tasks that ant $k$ can choose to perform in the next step.
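For reference, the standard ant colony transition rule that matches these symbol descriptions can be written as below. The notation ($p_{ij}^{k}$, $\tau_{ij}$, $\eta_{ij}$, $allowed_k$) is assumed here for readability.

```latex
p_{ij}^{k}(t) =
\begin{cases}
\dfrac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}(t)]^{\beta}}
      {\sum_{s \in allowed_k} [\tau_{is}(t)]^{\alpha}\,[\eta_{is}(t)]^{\beta}}, & j \in allowed_k \\[2ex]
0, & \text{otherwise}
\end{cases}
```

The denominator normalizes over the candidates still open to ant $k$, so the probabilities of all allowed choices sum to one.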
According to Equation (1), the two key factors affecting the selection of resource nodes are the pheromone value $\tau_{ij}(t)$ and the heuristic function $\eta_{ij}(t)$. Improving these two factors is both the difficulty and the focus of the algorithm research. The ACTS-LB algorithm proposed in this paper builds on the standard ant colony algorithm: it makes corresponding improvements and updates the pheromone rules according to the node resource hardware performance and load balancing value.
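The selection rule described by Equation (1) can be sketched in Python as follows. This is a minimal illustration of the standard ant colony rule, assuming the pheromone and heuristic values are given as plain lists; the function names `transition_probabilities` and `roulette_select` are hypothetical, and the uniform fallback for an all-zero weight vector is an added safeguard rather than part of the rule.

```python
import random

def transition_probabilities(tau_row, eta_row, allowed, alpha, beta):
    """Probability of choosing each candidate j in `allowed` (standard ACO rule)."""
    weights = {j: (tau_row[j] ** alpha) * (eta_row[j] ** beta) for j in allowed}
    total = sum(weights.values())
    if total == 0.0:
        # Degenerate case: fall back to a uniform choice among allowed candidates.
        return {j: 1.0 / len(allowed) for j in allowed}
    return {j: w / total for j, w in weights.items()}

def roulette_select(probabilities, rng):
    """Pick one candidate with probability proportional to its value."""
    candidates = list(probabilities)
    weights = [probabilities[j] for j in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Toy values: candidate 1 is excluded because it is not in the allowed set.
tau_row = [1.0, 1.0, 2.0]
eta_row = [0.5, 0.25, 0.5]
probs = transition_probabilities(tau_row, eta_row, allowed=[0, 2], alpha=1.0, beta=1.0)
chosen = roulette_select(probs, random.Random(0))
```

Candidates outside `allowed` receive no probability at all, which is how the search table prevents a task from being selected twice.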
4.2. Ant Colony Optimization Task Scheduling Algorithm Rule
In the actual SWIM, in order to achieve effective load balancing, the hardware performance of the node resources must be considered. First, reasonably assign tasks to each node according to the node processing capacity. The hardware performance is represented by two key indicators—computing power and system communication bandwidth of resource nodes. The hardware performance can be expressed by Equation (2).
In Equation (2), the hardware performance of a resource node is determined by its computing power and its communication bandwidth, combined using two constants.
In SWIM, the load of a resource node is mainly determined by CPU utilization, memory utilization and system bandwidth occupancy. The load balancing value of SWIM resource nodes can be expressed by Equation (3).
In Equation (3), $L_j$ represents the load balancing value of resource node $r_j$; $C_j$ represents the CPU utilization rate of resource node $r_j$; $M_j$ represents the memory usage of resource node $r_j$; $B_j$ represents the bandwidth occupancy of resource node $r_j$; the coefficients $\omega_1$, $\omega_2$ and $\omega_3$ are constants, representing the weights of the three items, and $\omega_1 + \omega_2 + \omega_3 = 1$.
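Read as a weighted sum of the three indicators (an assumption consistent with the description of three weighted items), the load value of a node could be computed as in the sketch below. The function name `node_load` is hypothetical; the default weights 0.4, 0.3 and 0.3 are the values reported in the parameter settings of Section 5.2, and their order (CPU, memory, bandwidth) is assumed.

```python
def node_load(cpu, mem, bw, weights=(0.4, 0.3, 0.3)):
    """Weighted load of one node; inputs are utilizations in [0, 1]."""
    w_cpu, w_mem, w_bw = weights
    if abs(w_cpu + w_mem + w_bw - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return w_cpu * cpu + w_mem * mem + w_bw * bw

# A node at 50% CPU, 20% memory and 10% bandwidth occupancy.
load = node_load(cpu=0.5, mem=0.2, bw=0.1)
```

Because the weights sum to one, the result stays in the same [0, 1] range as the individual utilizations, which keeps loads of different nodes directly comparable.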
The average value of the load of all resource nodes in SWIM is expressed by Equation (4).
Then the resource node load standard deviation is expressed by Equation (5).
The resource node load standard deviation $\sigma$ in SWIM reflects the degree of system load balance. The larger the value of $\sigma$, the more unbalanced the system load; the smaller the value of $\sigma$, the more balanced the system load.
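The average load and load standard deviation referred to in Equations (4) and (5) can be illustrated with a short sketch. It assumes the population form of the standard deviation (dividing by the number of nodes m); the normalization used in the paper may differ, and the function names are hypothetical.

```python
import math

def load_mean(loads):
    """Average load over all resource nodes."""
    return sum(loads) / len(loads)

def load_std(loads):
    """Population standard deviation of node loads (divides by m, an assumption)."""
    mean = load_mean(loads)
    return math.sqrt(sum((x - mean) ** 2 for x in loads) / len(loads))

balanced = load_std([0.5, 0.5, 0.5, 0.5])   # every node carries the same load
skewed = load_std([0.1, 0.9, 0.1, 0.9])     # same average, very uneven spread
```

Both load vectors have the same average, so only the standard deviation distinguishes the balanced system from the unbalanced one.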
The ant colony algorithm pheromone update is expressed by Equation (6).
In Equation (6), $1-\rho$ is the information residual factor and $\rho$ is the pheromone volatilization coefficient, which is employed to avoid infinite pheromone accumulation; its value is limited to the interval between 0 and 1. $\tau_{ij}(t)$ is the pheromone concentration value on the path ($i$, $j$) at time $t$; $\Delta\tau_{ij}(t)$ is the pheromone increment value on the selected path ($i$, $j$), and at the initial time $\Delta\tau_{ij}(0) = 0$.
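The standard evaporate-and-deposit form of such an update, new pheromone = (1 - rho) * old pheromone + increment, can be sketched as follows. This illustrates the generic ant colony rule only; how the increment is computed (Equation (7)) is not modeled here, and `update_pheromone` is a hypothetical name.

```python
def update_pheromone(tau, delta_tau, rho):
    """Return a new matrix with (1 - rho) * tau[i][j] + delta_tau[i][j]."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho is expected to lie between 0 and 1")
    return [[(1.0 - rho) * t + d for t, d in zip(tau_row, delta_row)]
            for tau_row, delta_row in zip(tau, delta_tau)]

tau = [[1.0, 1.0], [1.0, 1.0]]
delta = [[0.0, 0.5], [0.0, 0.0]]   # only path (0, 1) receives a deposit
new_tau = update_pheromone(tau, delta, rho=0.6)
```

Paths that receive no deposit simply decay by the factor (1 - rho), which is what prevents unbounded pheromone accumulation.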
This paper uses the resource node hardware performance parameters and the system average load difference to update the pheromone, as expressed by Equation (7). The heuristic function $\eta_{ij}(t)$ in the ant colony algorithm can be expressed by Equation (9).
In order to maximize the utility, we must also consider the completion time of all tasks and try to minimize it. Therefore, the expected completion time is used to calculate the heuristic function of the ant colony algorithm. The smaller the expected completion time, the larger the value of $\eta_{ij}(t)$ and the higher the value of $p_{ij}^{k}(t)$. Thus, the degree of expectation that an ant will transfer from task $t_i$ to node $r_j$ is improved.
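A common choice that behaves as described (smaller expected completion time, larger heuristic value) is the reciprocal of the ETC entry. The sketch below assumes this reciprocal form, which may not be the exact form of Equation (9); `heuristic_matrix` is a hypothetical name.

```python
def heuristic_matrix(etc):
    """eta[i][j] = 1 / etc[i][j]; a shorter expected time gives a larger value."""
    return [[1.0 / t for t in row] for row in etc]

etc = [[4.0, 8.0], [5.0, 2.0]]   # toy expected completion times (tasks x nodes)
eta = heuristic_matrix(etc)
```

For task 0 the faster node 0 gets twice the heuristic value of node 1, so it is favored in the transition probability before any pheromone has been laid.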
The pheromone update calculation in Equation (6) and the heuristic function calculation in Equation (9) are substituted into Equation (1) to obtain the probability that task $t_i$ is scheduled to resource node $r_j$ after the optimization of the ant colony algorithm. The computational complexity of the improved ACO algorithm presented in this paper is expressed in terms of the number of cycles and the total number of tasks.
4.3. Ant Colony Task Scheduling Algorithm Optimization Process
The main idea is to select the best order of task scheduling through the ant colony algorithm, in which the scheduler can be regarded as an ant and the task scheduling process is compared with the ant foraging process. The pheromone is updated according to the hardware performance parameters of the system resource node and the system average load difference, and the expected time to complete all tasks is at a minimum. So, the expected completion time is used to calculate the heuristic function of the ant colony algorithm. Finally, a resource node with high pheromone concentration (high performance and low load) and minimum completion time is selected to handle the assigned task.
The goal of SWIM task scheduling is to map each task in task set $T$ to a resource node in resource set $R$, and to improve the task scheduling performance of the system on the premise of ensuring load balancing. The specific steps of the ant colony optimization task scheduling algorithm for SWIM based on load balancing are as follows. The flow chart of the algorithm is shown in Figure 2.
Step 1: First, initialize the parameters in the algorithm. There are n tasks and m resources; set the number of ants and the maximum number of cycles, and set the initial time to 0 (in this scenario, the tasks can be taken to be air flight planning missions and the resources to be navigation information resources in SWIM).
Step 2: Increase the cycle counter by one. If it does not exceed the maximum number of cycles, then after initializing the search table, all ants search the path from the initial position and stop searching when all tasks have entered the search table.
Step 3: The hardware performance and load balance difference of the resource nodes are obtained by calculation, and the transition probability of each resource node is calculated by Equation (1); this probability is used to select the resource node that performs the task.
Step 4: Update the search table and add the scheduled task to it. Use the search table to record the set of tasks that the ant has passed through so far. Do not select a path that has already been taken as the next path.
Step 5: Update the pheromone of the ant colony algorithm according to Equation (6) and save the information of the ant with the best scheduling result.
Step 6: Repeat the above steps until the maximum number of cycles is reached, when the algorithm ends [19].
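The control flow of Steps 1 to 6 can be sketched as a generic ant colony scheduling loop. This is a simplified illustration, not the ACTS-LB implementation: the pheromone deposit uses the common 1/makespan rule as a stand-in, so the load-aware update of Equation (7) is not modeled; the heuristic is assumed to be the reciprocal of the expected completion time; and all names and default values (including mapping the reported settings 1.0, 2.7 and 0.6 to `alpha`, `beta` and `rho`) are assumptions.

```python
import random

def makespan(assignment, etc, m):
    """Completion time of the busiest node for a task -> node assignment."""
    busy = [0.0] * m
    for task, node in enumerate(assignment):
        busy[node] += etc[task][node]
    return max(busy)

def aco_schedule(etc, n_ants=10, n_cycles=50, alpha=1.0, beta=2.7, rho=0.6, seed=0):
    rng = random.Random(seed)
    n, m = len(etc), len(etc[0])
    tau = [[1.0] * m for _ in range(n)]                 # Step 1: initial pheromone
    eta = [[1.0 / etc[i][j] for j in range(m)] for i in range(n)]
    best_assignment, best_span = None, float("inf")
    for _ in range(n_cycles):                           # Steps 2 and 6: cycle loop
        delta = [[0.0] * m for _ in range(n)]
        for _ in range(n_ants):
            assignment = []
            for i in range(n):                          # Steps 3-4: place each task once
                weights = [(tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in range(m)]
                assignment.append(rng.choices(range(m), weights=weights, k=1)[0])
            span = makespan(assignment, etc, m)
            for i, j in enumerate(assignment):
                delta[i][j] += 1.0 / span               # assumed deposit rule (not Eq. (7))
            if span < best_span:                        # keep the best ant so far
                best_assignment, best_span = assignment, span
        for i in range(n):                              # Step 5: evaporate and deposit
            for j in range(m):
                tau[i][j] = (1.0 - rho) * tau[i][j] + delta[i][j]
    return best_assignment, best_span

etc = [[4.0, 8.0], [5.0, 2.0], [3.0, 6.0], [9.0, 3.0]]  # toy ETC matrix (4 tasks, 2 nodes)
assignment, span = aco_schedule(etc)
```

Iterating over the tasks in order and placing each exactly once plays the role of the search table here; a load-aware variant would replace the `1.0 / span` deposit with a term that rewards high-performance, lightly loaded nodes.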
5. Experiment and Results Analysis
In the experiment, we referred to the SWIM structure that is already deployed by an air traffic administration of civil aviation, and we used the network simulation tool NS-3 to verify the algorithm performance. The SWIM local task scheduling structure is shown in Figure 3.
5.1. Experimental Environment
The simulation experiment used a topology model with one scheduler, four routers, six servers and eight clients. Node 1 is the scheduler, nodes 2–5 are the routers, nodes 6–11 are the servers, and nodes 12–19 are the clients. The links between the client and the router, between the router and the scheduler, and between the router and the server each have a bandwidth of 150 Mbps and a one-way delay of 10 ms. The simulation experiment test topology is shown in Figure 4.
5.2. Experimental Parameter Settings
For the selection of parameters $\alpha$, $\beta$ and $\rho$ in the ant colony algorithm, there is no general method to determine their optimal combination. It should be noted that these parameters have a great impact on the performance of the ant colony algorithm. The smaller the pheromone residual factor $1-\rho$, the smaller the influence of the previously searched path on the current search, making the algorithm difficult to converge. The larger the residual factor, the greater the influence of the previous search path on the current search, but this also increases the risk of falling into a local optimum. So, the $\rho$
value is generally set at 0.5. In the parameter selection experiment, the
and
values were set as
= 0.5 and
= 0.6. The simulation program was cycled 1000 times, the best result was selected as one experiment, taking the average value of 100 experiments. The parameters are substituted into the algorithm for the simulation test. The simulation experiment data are shown in
Table 3.
From Table 3, we can see that the optimal path is 568.10969 and the running time is 39.86 s, with $\rho$ = 0.6, $\alpha$ = 1.0253 and $\beta$ = 2.7526. After the comparative study and multiple experiments, the parameters in the simulation experiments of this paper were set to $\alpha$ = 1.0, $\beta$ = 2.7 and $\rho$ = 0.6. In the experiment, we found that when the same parameters were used for experiments of different scales, the results obtained were different. The weight parameters of the resource node load impact indicators in SWIM were set to $\omega_1$ = 0.4, $\omega_2$ = 0.3 and $\omega_3$ = 0.3. The population size was 50, and the maximum number of iterations was 150. The simulation experiment test parameters are shown in Table 4.
5.3. Experimental Results and Analysis
In order to verify the effectiveness of the ACTS-LB algorithm designed in this paper, we tested the method using the same parameter configuration in two different situations: (1) the number of service nodes is fixed, but the number of scheduling tasks is changed; (2) the number of scheduling tasks is fixed, but the number of resource nodes is changed. The ACTS-LB algorithm was compared with the classic min-min algorithm [20], the ACO algorithm [21] and the PSO algorithm [10] in terms of transmission delay, task execution time and system load deviation value.
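For readers unfamiliar with the min-min baseline, the classic heuristic can be sketched as follows: at each step, compute for every unscheduled task its earliest possible completion time over all nodes, then schedule the task whose earliest completion time is smallest. This is a generic textbook sketch operating on an ETC matrix, not the implementation used in the experiments; the function name and the toy matrix are illustrative.

```python
def min_min(etc):
    """Assign tasks with the min-min heuristic; returns (assignment, makespan)."""
    n, m = len(etc), len(etc[0])
    ready = [0.0] * m                  # time at which each node becomes free
    assignment = [None] * n
    unscheduled = set(range(n))
    while unscheduled:
        best = None                    # (completion_time, task, node)
        for i in sorted(unscheduled):
            # Earliest completion time of task i over all nodes.
            j = min(range(m), key=lambda node: ready[node] + etc[i][node])
            candidate = (ready[j] + etc[i][j], i, j)
            if best is None or candidate < best:
                best = candidate
        finish, task, node = best
        assignment[task] = node        # schedule the task with the smallest minimum
        ready[node] = finish
        unscheduled.remove(task)
    return assignment, max(ready)

etc = [[4.0, 8.0], [5.0, 2.0], [3.0, 6.0], [9.0, 3.0]]  # toy ETC matrix (4 tasks, 2 nodes)
assignment, span = min_min(etc)
```

Min-min is greedy on completion time alone and takes no account of node load, which is why it serves as a natural baseline for a load-aware scheduler.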
Experiment 1: The transmission delay comparison.
(1) Set the number of service nodes at m = 20 and the number of scheduling tasks at 10 < n < 200.
(2) Set the number of scheduling tasks at n = 100 and the number of service nodes at 5 < m < 50.
In the above two cases, by running the simulation software NS-3, the time delay data of the path was obtained by measuring different iterations of the ACTS-LB algorithm, min-min algorithm, ACO algorithm and PSO algorithm in the path finding process. The trace file obtained was counted and analyzed by the statistical analysis tool Wireshark. We obtained the total time delay value of the path after each iteration for each of the four algorithms, and the average value of each algorithm was obtained after each iteration was run 1000 times. The result was processed by MATLAB. The results are shown in
Figure 5 and
Figure 6.
It can be seen from the experimental results in Figure 5 and Figure 6 that, at the beginning of ant colony routing, the four algorithms show large fluctuations in the total transmission delay of the path and perform similarly. After 100 iterations, the ACO algorithm basically levels off and no longer changes, indicating that it falls into premature convergence after the 100th iteration and can no longer search for a better path. When the number of tasks increases in the later period, the transmission delay of the min-min algorithm and the PSO algorithm does not change noticeably. The ACTS-LB algorithm expands the path space through the optimized updating of pheromones, and the ants perturb other paths so that the obtained paths are evenly distributed in the path space, which gives the ants a stronger search ability and lets them find better quality transmission paths. Thus, the ACTS-LB algorithm can effectively solve the premature convergence problem of the ACO algorithm, providing ants with a stronger search ability and obtaining a better-quality path set.
Experiment 2: The task execution time span comparison.
(1) Set the number of service nodes at m = 20 and the number of scheduling tasks at 10 < n < 200.
(2) Set the number of scheduling tasks at n = 100 and the number of service nodes at 5 < m < 50.
In the above two cases, the task execution time of the ACTS-LB algorithm, min-min algorithm, ACO algorithm and PSO algorithm were tested in the context of task scheduling. The experimental results were sorted according to the data in NS-3. The four algorithms were executed 1000 times and averaged. The results of the task execution time span are shown in
Figure 7 and
Figure 8.
It can be seen from the experimental results in Figure 7 that, in the initial stage of task scheduling, the task execution times of the ACTS-LB algorithm, min-min algorithm, ACO algorithm and PSO algorithm are basically the same. However, the task execution time of the ACTS-LB algorithm becomes smaller than that of the other three algorithms as the number of tasks increases. From the experimental results in Figure 8, it can also be seen that when the number of service nodes increases, the task scheduling execution time of the ACTS-LB algorithm is less than that of the min-min algorithm, ACO algorithm and PSO algorithm. This shows that calculating the heuristic function from the expected execution time in the ACTS-LB algorithm plays an effective role in shortening the task execution time, and ants tend to choose the path with high pheromone concentration and minimum completion time. It can be concluded that the ACTS-LB algorithm can improve task execution efficiency and enable ants to have a stronger task scheduling ability.
Experiment 3: The system load deviation comparison.
(1) Set the number of service nodes at m = 20 and the number of scheduling tasks at 10 < n < 200.
(2) Set the number of scheduling tasks at n = 100 and the number of service nodes at 5 < m < 50.
In this paper, the load deviation value of SWIM resource nodes is proposed as a reference standard for evaluating load balancing in task scheduling. Its definition can be expressed by Equation (10).
The definition of the SWIM resource node load value $L_j$ is the same as in Equation (3); $L_{\max}$ and $L_{\min}$ are the maximum and minimum values of $L_j$, respectively; $\bar{L}$ is the average of all resource node loads. According to the experimental results, the load deviation values of the resource nodes during the task scheduling of the four algorithms were obtained. The results are shown in Figure 9 and Figure 10.
It can be seen from Figure 9 and Figure 10 that the load deviation value of the ACTS-LB algorithm is smaller than that of the min-min algorithm, ACO algorithm and PSO algorithm, and that it changes more slowly as the number of tasks increases. This is because in the ACTS-LB algorithm, pheromones are updated according to the hardware quality and the load balancing of node resources, so ants tend to aim for resource nodes with high pheromone concentrations (high performance and low load of resource nodes) when selecting the next task processing target node. It can be concluded that the ACTS-LB algorithm, with the load deviation value as the reference standard, can achieve the goal of load balancing and improve the overall performance of task scheduling.
5.4. ACTS-LB Algorithm Shortcomings
The ACTS-LB algorithm, however, can still be improved, and requires further optimization and in-depth study in the following aspects:
(1) The ant colony algorithm converges slowly in the initial stage and can easily fall into a local optimum, and the ACTS-LB algorithm inherits these shortcomings. In subsequent improvements, we will draw on the advantages of other heuristic algorithms, such as the simulated annealing (SA) algorithm and the genetic algorithm. Combining the different characteristics of these algorithms, so that their advantages and disadvantages complement each other, will help us improve the algorithm’s overall task scheduling performance.
(2) The ant colony algorithm parameter setting is very important. If it is not set properly, it will slow down the solving speed and affect the quality of the results. In the next SWIM task scheduling study, we will consider the task dynamics and the service quality of the system. We plan to dynamically adjust the corresponding parameters in the algorithm and study the impact of ant colony parameters and local search methods on the overall performance of the system, so as to continuously improve and perfect the task scheduling strategy.