A Virtual Machine Consolidation Algorithm Based on Dynamic Load Mean and Multi-Objective Optimization in Cloud Computing

High energy consumption and low resource utilization have become increasingly prominent problems in cloud data centers. Virtual machine (VM) consolidation is the key technology for addressing these problems. However, excessive VM consolidation may lead to service level agreement violations (SLAv). Most studies have focused on optimizing energy consumption and ignored other factors. An effective VM consolidation should comprehensively consider multiple factors, including the quality of service (QoS), energy consumption, resource utilization, migration overhead and network communication overhead, which makes it a multi-objective optimization problem. To solve the problems above, we propose a VM consolidation approach based on dynamic load mean and multi-objective optimization (DLMM-VMC), which aims to minimize power consumption, resources waste, migration overhead and network communication overhead while ensuring QoS. First, considering multi-dimensional resources, the host load status is objectively evaluated by the proposed host load detection algorithm based on the dynamic load mean to avoid excessive VM consolidation. Then, the best solution is obtained based on the proposed multi-objective optimization model and an optimized ant colony algorithm, so as to serve the common interests of cloud service providers and users. Finally, the experimental results show that, compared with existing VM consolidation methods, the proposed algorithm significantly improves energy consumption, QoS, resources waste, SLAv, migration overhead and network overhead.


Introduction
With the rapid development of cloud computing, data centers keep growing in scale, and a large number of hosts around the world consume a huge amount of power every day, resulting in high CO2 emissions [1]. Studies have shown that the average CPU utilization of physical hosts in cloud data centers is only 15-20%, and that an idle server consumes 70% of its peak energy [2,3], which not only wastes energy and resources but also greatly increases the operating costs of cloud service providers. In addition, VMs run data-intensive applications by accessing physical networks and communicating with interdependent VMs on different servers, which not only increases the network traffic but also reduces the overall performance of the data center's network [4]. For users, this seriously degrades the customer experience, while for cloud service providers, QoS may not be guaranteed, leading to SLAv. Therefore, considering the interests of both cloud service providers and users, the above issues give rise to a complex and challenging multi-objective resource management problem in cloud computing.
Cloud computing resources are provided to users in the form of VMs. Therefore, the core problem of resource management in cloud computing is VM management, and VM consolidation is the main method used to solve this problem. The key to VM consolidation is to dynamically obtain the optimal mapping between VMs and hosts, so as to minimize the energy consumption and resources waste and thereby reduce the operating costs while ensuring QoS for customers.
The main process of VM consolidation is to first determine the host load, then decide whether to migrate VMs according to the host load status, and finally migrate the VMs to a new host. However, a dynamic cloud environment poses additional challenges to effective VM consolidation. First, in a dynamic cloud environment, especially in a cloud data center with thousands of hosts, detecting the host load state effectively and accurately remains an important problem. Some related studies [3,5,6] determine the host load status based on static thresholds, which lack awareness of the dynamic workloads in the data center. The current resource utilization alone cannot objectively reflect the real load of a host, which leads to excessive consolidation. The method in [7] uses a dynamic threshold that considers the workload changes on the source host and target host after VM migration, but it does not consider the resource load balance, which leads to large resources waste. In addition, some of the existing methods determine the host load state based only on the CPU resource [3-10], ignoring the impact of memory, network and other resources, which are also key factors affecting the QoS. Some studies [11-14] consider the impact of multi-dimensional resources, but mostly focus on optimizing energy consumption and SLAv, ignoring the VM migration overhead and the network communication overhead. For example, the performance of a VM is reduced while it is being migrated, and placing two interdependent VMs far apart causes a large communication delay [15].
VM consolidation is a well-known NP-hard problem [16]. Some studies have used heuristic algorithms to solve this problem [3,17-19]. Heuristic algorithms are widely used in VM consolidation because of their simple implementation and low complexity. However, traditional heuristic algorithms are prone to falling into local optima. In existing research, meta-heuristic algorithms such as ant colony optimization (ACO), artificial bee colony and genetic algorithms are used to solve the VM placement problem [16,20-22]. These algorithms can effectively avoid local optima when solving large-scale problems. The ant colony system (ACS) is a classical ACO method. It can obtain a good solution for complex combinatorial problems in a reasonable time. Because of its excellent performance in solving NP-hard and combinatorial optimization problems [23,24], it has attracted more and more attention for VM consolidation. In this paper, VM consolidation is treated as a multi-objective combinatorial optimization problem, and the ACS is therefore selected as the solver. However, as data centers grow, some studies [25-27] that use the ACS for VM consolidation suffer from long execution times because the search space of the ACS grows as well. Therefore, it is necessary to further optimize the search performance.
To solve the above problems, this paper proposes a VM consolidation algorithm based on the dynamic load mean and multi-objective optimization in cloud computing, in which the host load status is comprehensively measured based on multi-dimensional resources and the dynamic characteristics of the system load are considered. Then, the optimized ant colony algorithm is used to obtain the optimal mapping between the VMs and the hosts so as to jointly optimize the resource utilization, energy consumption, migration overhead and communication overhead. Our main contributions are as follows: (1) A host load detection method based on the dynamic load mean is proposed, which uses multi-dimensional resources to comprehensively measure the host load status and considers the impact of load fluctuations to avoid excessive consolidation. (2) A network-aware model is proposed to optimize the network communication overhead of interdependent VMs and the overall network traffic of the data center. (3) An improved ant colony optimization algorithm is proposed, which obtains a better solution and a better execution efficiency by optimizing the heuristic factors and the execution process. The rest of the paper is organized as follows. Section 2 describes the related work. Section 3 introduces the relevant models. Section 4 introduces the host load detection method based on the dynamic load mean. Section 5 introduces the DLMM-VMC algorithm in detail. Section 6 gives the experimental results and performance evaluation. Section 7 concludes the paper and discusses future work.

Related Work
VM consolidation mainly solves three problems: how to determine the host load status, which mainly involves host load detection methods; how to select the VMs to migrate, which mainly involves calculating the migration overhead; and how to select the placement hosts, which mainly involves choosing the best target host for the migrated VMs.
For how to determine the host load status, some studies [3,5,6,28,29] used static thresholds of CPU utilization to determine whether a host was overloaded or underloaded, keeping the CPU utilization of the host between two fixed thresholds. However, in a cloud data center, resource utilization is constantly changing in multiple resource dimensions. In such a dynamic cloud environment, setting static thresholds or using the current utilization of a single resource is not an effective approach and leads to excessive consolidation. Therefore, some dynamic threshold algorithms were proposed. Beloglazov et al. [17] proposed an adaptive upper and lower thresholds method that sorted hosts based on a statistical analysis of historical CPU data and CPU utilization prediction, which improves the ability to sense dynamic changes in the host load. However, the load fluctuation was not considered. Chen et al. [30] proposed a host load detection method based on a sliding time window, which recorded the host CPU utilization in a certain time window through regular sampling. When the host CPU utilization continuously exceeded the predefined threshold, the host was determined to be overloaded. Yadav et al. [31] proposed two adaptive methods based on robust regression to dynamically set the thresholds. Zhou et al. [32] proposed a dynamic adaptive three-threshold host load detection method, using a K-Means clustering algorithm to divide the hosts into four types. However, these methods use CPU utilization as the main criterion to determine the host load. Therefore, they cannot accurately describe the load status of hosts with multi-dimensional resource characteristics, which ultimately leads to unnecessary migrations and resources waste.
For how to select the VMs to migrate, the authors of [17,33] proposed several migration VM selection strategies, which have been widely used. These strategies include the maximum utilization (MU) strategy, which selects the VM that has the highest CPU usage; the random selection (RS) strategy, which selects a VM randomly based on a uniformly distributed discrete random variable; the maximum correlation (MC) strategy, which selects the VM with the highest correlation with other VMs; and the minimum migration time (MMT) strategy, which selects the VM with the shortest migration time. Li et al. [34] used the similarity between the memory contents of VMs to select a migration VM. This method aimed to select a VM with the highest similarity in memory contents from different hosts to reduce the migrated data and time. Masoumzadeh et al. [35] proposed a VM selection strategy based on fuzzy Q-learning, where multiple VM selection technologies were integrated, and the VM selection strategy was chosen dynamically based on fuzzy logic theory according to the current state of a host. Laili et al. [36] proposed a selection mechanism based on an iterative prediction algorithm, which used a reverse selection mechanism to select the most suitable VM from the candidate VM set for each randomly selected host. However, these studies only consider the migration overhead when selecting migration VMs, ignoring the impact on the host load. For overloaded hosts, the overload status should be quickly and accurately eliminated with the minimum migration overhead, while for lightly loaded hosts, the resource usage should be quickly reduced so that the host can be shut down as soon as possible to reduce the energy consumption.
For how to select the placement hosts, the studies in [3,18] used heuristic greedy algorithms to solve the VM placement problem, for instance the first fit (FF), first fit decreasing (FFD), best fit (BF) and best fit decreasing (BFD) algorithms [30,31], which aim to reduce the number of running hosts and the number of VM migrations. However, classical heuristic algorithms are not directly suited to VM consolidation, so many subsequent studies have improved these algorithms to make them applicable to VM consolidation. Beloglazov et al. [17] proposed a power-aware best-fit decreasing (PABFD) VM placement algorithm, which selects the host with the least energy consumption increase as the placement host. Li et al. [37] proposed virtual switch aware BFD and FFD algorithms, which comprehensively considered the traffic between VMs and the CPU overhead generated by virtual switches. Moges et al. [38] proposed a modified best fit decreasing (MBFD) algorithm to improve SLAv and the number of active hosts. Zhang et al. [39] further optimized the MBFD algorithm by combining FF and MBFD to achieve a better energy efficiency.
The heuristic greedy algorithm has a low complexity, but it is not suitable for solving large-scale problems and cannot be well applied to large-scale data centers. The metaheuristic algorithm has significant advantages in solving such problems. Li et al. [13] proposed a QoS-aware and multi-objective dynamic VM consolidation (QMOD) based on an improved genetic algorithm, which optimizes the three objectives of load balancing, migration overhead and QoS. Li et al. [20] formulated the VM consolidation problem as a multi-objective optimization problem with multiple resource constraints and solved the problem based on the artificial bee colony algorithm. They also proposed a VM consolidation method based on differential evolution (DE) [22]. However, in these methods, only the energy consumption and host overload risk are considered. Al-Moalmi et al. [40] proposed a VM placement method based on grey wolf optimization (GWO), which can use the CPU and RAM resources more effectively and reduce the number of active hosts, energy consumption and SLAv. Aryania et al. [26] proposed an energy-aware VM consolidation based on the ACS, which takes the energy consumption caused by VM migration as an important optimization goal. Farahnakian et al. [27] proposed an ACS-based VM consolidation approach that aims to maximize the number of dormant hosts and minimize the number of VM migrations. However, the above VM consolidation algorithms based on ant colony optimization take too long to execute for larger-scale data centers due to the large search space. Xiao et al. [25] proposed an improved ACS to solve the VM consolidation problem; they used designed heuristic factors to select both migration VMs and placement hosts and limited the search space of the ants according to the type of host load state, thus reducing the blindness of the ant search and optimizing the execution efficiency.
However, they ignore the optimization of selecting migration VMs, although the selection of migration VMs and the selection of placement hosts jointly determine the search performance and execution efficiency of the ACS. Moreover, the above algorithms do not consider resources waste and network overhead. Unbalanced resource utilization greatly increases the resources waste, which increases the number of active hosts and thus the energy consumption. In addition, an increase in the network overhead greatly delays the response time of the applications in the VMs, which easily leads to QoS degradation. Therefore, VM consolidation should comprehensively consider the interests of both the cloud service providers and the users, so that the cloud service providers minimize the operational costs while QoS is ensured for the users.

Data Center Resource Representation Model
Assume $H = \{h_1, h_2, \cdots, h_j, \cdots, h_n\}$ is the set of hosts in a data center, where $n$ is the number of hosts. Each host has $D$-dimensional resources, and $C^d_{h_j}$, $d \in \{1, 2, \cdots, D\}$, represents the capacity of resource $d$ of host $h_j$. $V = \{v_1, v_2, \cdots, v_i, \cdots, v_m\}$ is the set of VMs in the data center, where $m$ is the number of VMs; similarly, each VM also has $D$-dimensional resources, and $C^d_{v_i}$, $d \in \{1, 2, \cdots, D\}$, represents the capacity of resource $d$ of VM $v_i$. Further, $VM_{h_j}$ denotes the set of VMs on host $h_j$, $U^d_t(v_i)$ represents the actual utilization of resource $d$ of VM $v_i$ at time $t$, and $U^d_t(h_j)$ is the actual utilization of resource $d$ of host $h_j$ at time $t$, which can be expressed as Equation (1).
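The following minimal sketch illustrates one plausible reading of how the host utilization follows from the utilization of its hosted VMs. It assumes that the host utilization of a resource is the capacity-weighted VM usage divided by the host capacity; this form, the function name `host_utilization` and the input layout are assumptions for illustration, not the paper's Equation (1).

```python
def host_utilization(host_capacity, vms):
    """Utilization of one resource dimension d on a host.

    host_capacity: capacity C^d of the host (assumed > 0)
    vms: list of (vm_capacity, vm_utilization) pairs for the VMs on the host
    Assumption: used amount = sum of VM capacity * VM utilization.
    """
    if host_capacity <= 0:
        raise ValueError("host_capacity must be positive")
    used = sum(capacity * utilization for capacity, utilization in vms)
    return used / host_capacity


# Two VMs on a host with 100 units of the resource.
cpu_util = host_utilization(100.0, [(20.0, 0.5), (40.0, 0.25)])
```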

Energy Consumption Model
The energy consumption of a host in a data center comes mainly from its CPU, memory, hard disk and network components, but studies have shown that the CPU is the main energy consuming device and that there is a linear relationship between CPU utilization and host energy consumption [3]. Therefore, we can establish the following energy consumption model, as shown below.
where $P^{max}_j$ is the energy consumption of host $h_j$ when it is fully loaded, $U^d_t(h_j)$ is the CPU utilization of host $h_j$ at time $t$, and $k$ is the energy consumption factor. Studies show that the energy consumption of an idle host is 70% of that of a fully loaded host, so $k$ is generally set to 0.7 [3]. It follows that idle hosts still consume a large share of the peak energy, so they should be shut down promptly to reduce the energy consumption.
Therefore, the total energy consumption in the data center is calculated as follows.
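As a minimal sketch of the linear model described above, the code below assumes the widely used form in which a host draws the idle share k of its peak power plus a part proportional to CPU utilization. The total is approximated by summing over sampled time steps; the function names, the sampling interval and the input layout are assumptions for illustration rather than the paper's exact equations.

```python
def host_power(p_max, cpu_util, k=0.7):
    """Linear power model: k * p_max when idle, p_max when fully loaded."""
    if not 0.0 <= cpu_util <= 1.0:
        raise ValueError("cpu_util must be in [0, 1]")
    return k * p_max + (1.0 - k) * p_max * cpu_util


def datacenter_energy(samples, interval_s, k=0.7):
    """Approximate total energy as the sum of power * sampling interval.

    samples: one list per time step, each holding (p_max, cpu_util) pairs
             for the hosts that are switched on at that step.
    interval_s: length of one time step in seconds (assumed constant).
    """
    total_power = sum(host_power(p_max, util, k)
                      for step in samples
                      for p_max, util in step)
    return total_power * interval_s


# One 250 W host observed idle for 300 s, then fully loaded for 300 s.
energy_j = datacenter_energy([[(250.0, 0.0)], [(250.0, 1.0)]], 300)
```

Hosts that have been shut down simply do not appear in a time step, which is how consolidation lowers the total in this sketch.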

Resources Waste Model
Each host in the data center may run multiple VMs at the same time, and different VMs may run different applications, so the resources requirements in various dimensions are different. An unreasonable resources allocation will increase the resources waste. Therefore, it is important to ensure that the remaining resources of each dimension on the host are balanced in order to fully utilize the resources and prevent the resources waste. So, we propose a resources waste model, as shown below.
where $U^{d_x}_t(h_j)$ and $U^{d_y}_t(h_j)$ represent the utilization of resources $d_x$ and $d_y$ on host $h_j$, respectively. $W(h_j)$ and $W$ denote the resources waste of host $h_j$ and of the data center, respectively.

Communication Overhead Model
The more links the traffic between VMs passes through, the greater the network latency, which is one of the most important factors affecting the QoS. Therefore, optimizing the communication path between VMs reduces the network latency. The total network communication overhead $N$ is calculated based on the network communication overhead between pairs of VMs and the distance between the physical hosts where they are located, as shown in the following equation.
where $a(v_i, v_j)$ indicates the network communication overhead between $v_i$ and $v_j$, $h(v_i)$ and $h(v_j)$ denote the hosts where $v_i$ and $v_j$ are located, respectively, and $b(h(v_i), h(v_j))$ denotes the network communication distance between $h(v_i)$ and $h(v_j)$, which is measured by the number of switches and routes passed during communication; the larger the value, the greater the network communication overhead. Placing pairs of VMs with a high network communication overhead on the same host or on nearby hosts reduces the network communication overhead and greatly reduces the communication latency.
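The following sketch shows how such a total could be computed. It assumes that the pairwise overhead and the host distance are combined by multiplication and summed over all communicating pairs; that combination, the function name `communication_overhead` and the toy distance function are assumptions for illustration.

```python
def communication_overhead(traffic, placement, distance):
    """Total overhead N over all communicating VM pairs.

    traffic: dict mapping (vm_i, vm_j) to the pairwise overhead a(vm_i, vm_j)
    placement: dict mapping each VM to its current host
    distance: function (host_a, host_b) -> number of network hops
    Assumption: each pair contributes a(vi, vj) * b(h(vi), h(vj)).
    """
    return sum(a * distance(placement[vi], placement[vj])
               for (vi, vj), a in traffic.items())


def toy_distance(host_a, host_b):
    # Hypothetical topology: 0 hops on the same host, 2 hops otherwise.
    return 0 if host_a == host_b else 2


traffic = {("v1", "v2"): 10, ("v1", "v3"): 5}
placement = {"v1": "h1", "v2": "h1", "v3": "h2"}
total = communication_overhead(traffic, placement, toy_distance)
```

In this example the heavily communicating pair shares a host and contributes nothing, so only the lighter pair adds to the total.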

Migration Overhead Model
VM migration overhead is also a very important optimization objective, because VM migration consumes additional compute resources, and excessive VM migrations lead to large workloads and energy consumption. Moreover, VM migration can degrade the QoS. Therefore, the number of VM migrations should be minimized during VM consolidation.
where $m(v_i) = 1$ indicates that VM $v_i$ needs to be migrated and $M$ indicates the total number of migrated VMs.

Multi-Objective Optimization
Minimizing the energy consumption, network overhead, migration overhead and resources waste are the optimization objectives of VM consolidation in this paper. According to Equations (3) and (5)-(7), we get the following multi-objective optimization model with some constraints.
Constraints:
where $\vartheta_1 + \vartheta_2 + \vartheta_3 + \vartheta_4 = 1$ and $\vartheta_1, \vartheta_2, \vartheta_3, \vartheta_4$ are the weight values. Constraint (10) ensures that each VM is allocated to only one host. Constraints (11) and (12) ensure that each host meets the resource requirements of the VMs on it and does not exceed the maximum threshold $Thr^d_{max}(h_j)$.
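A minimal sketch of such a weighted-sum model is given below. It assumes that the four objective values have already been normalized to comparable scales, that the migration count is the number of VMs whose host changes, and that feasibility means the summed demand on each host stays below a threshold fraction of its capacity. The function names and the data layout are hypothetical.

```python
def migration_count(old_placement, new_placement):
    """M: number of VMs whose host differs between the two placements."""
    return sum(1 for vm, host in new_placement.items()
               if old_placement.get(vm) != host)


def weighted_objective(energy, waste, network, migrations, weights):
    """Weighted sum of four (assumed normalized) objectives; weights sum to 1."""
    if len(weights) != 4 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("four weights summing to 1 are required")
    w1, w2, w3, w4 = weights
    return w1 * energy + w2 * waste + w3 * network + w4 * migrations


def is_feasible(placement, demand, capacity, thr_max):
    """Check per-host, per-resource demand against thr_max * capacity.

    placement maps each VM to exactly one host, so the single-host
    constraint holds by construction of the dictionary.
    """
    used = {}
    for vm, host in placement.items():
        for d, amount in demand[vm].items():
            used[(host, d)] = used.get((host, d), 0.0) + amount
    return all(total <= thr_max * capacity[host][d]
               for (host, d), total in used.items())
```

A candidate placement would first be checked with `is_feasible` and then scored with `weighted_objective`; lower scores are better.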

Dynamic Load Mean
Host load detection is a key step in VM consolidation. Overload of any resource (e.g., CPU, memory, network or storage) greatly degrades the service performance and leads to SLAv. Therefore, the host load should be detected based on multi-dimensional resources. When hosts are overloaded, VM consolidation is performed to keep the resource utilization within a certain range and avoid the performance degradation caused by resource overload. In addition, the host load changes and fluctuates dynamically with the workload of its hosted VMs. A short-term fluctuation does not affect the performance of the system; if every fluctuation triggered VM migration, VM consolidation would become too aggressive, which not only poses the risk of host overload but also affects the performance of the applications on the VMs.
To solve the above problems, we propose a host load detection method based on the dynamic load mean (DLM-HLD). On the one hand, the DLM-HLD uses multi-dimensional resources to calculate the comprehensive load of the hosts. On the other hand, it considers the impact of system load fluctuations and uses the dynamic load mean within the recent sliding time window when calculating the load of each resource dimension. The sliding time window size is adjusted dynamically according to the size of the load fluctuation, which in turn adjusts the load mean. The comprehensive load $L(h_j)$ of host $h_j$ is calculated as shown in Equation (13).
where $\omega(d)$ is the weight coefficient of resource $d$, and $U^d(h_j, T)$ represents the load mean of resource $d$ in the sliding window $T$, which is calculated from multiple discrete samples according to Equation (14); the window size $T$ is given by Equation (15). The larger the absolute difference, the larger the sliding time window and the number of samples, and the better the ability to withstand load fluctuations during VM consolidation. The parameter $s$ defines the sensitivity to changes in resource $d$: the smaller $s$ is, the more sensitive the method is to changes.
Different resources have different utilization on a host. The higher the resource utilization, the greater the impact on the host overload and the higher the weight. The information entropy algorithm determines the weight according to the variation degree of the metric. The greater the variation degree, the greater the impact, the smaller the information entropy and the larger the weight. Conversely, the smaller the variation degree, the smaller the impact, the larger the information entropy and the smaller the weight. In order to objectively perceive the comprehensive host load state, this paper uses information entropy to calculate the comprehensive host load $L(h_j)$, and the specific steps are as follows.
(1) The decision matrix is calculated as the following equation, where each row of the matrix records the load means of the resources on one host $h_j$, and each column of the matrix corresponds to one resource type $d$.
(2) The matrix is normalized to obtain the matrix, as shown below.
where $u_{xy} = U^{d_{xy}}(h_j, T)$.
(3) Calculate the entropy of the $y$-th term: $E_y = -\frac{1}{\ln k} \sum_{x=1}^{k} u_{xy} \ln u_{xy}$, $E_y \in [0, 1]$.
(4) Compute the contribution degree $d_y = 1 - E_y$; the weight $\omega_y$ is then given by the following equation.
Here $\omega_y$ is the weight of resource $d$, and the comprehensive load $L(h_j)$ of host $h_j$ can then be calculated by Equation (13).
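The four steps above can be sketched in a few lines of Python. The sketch assumes the standard entropy weight method: each column is scaled so that its entries sum to 1, the weight of a column is its contribution degree divided by the sum of all contribution degrees, and the comprehensive load is the weighted sum of the per-resource load means. These normalization choices and the function names are assumptions for illustration.

```python
import math


def entropy_weights(matrix):
    """Entropy weights for the columns (resources) of a k x D load matrix.

    matrix[x][y] is the load mean of resource y on host x; k >= 2 is required
    because the entropy is scaled by 1 / ln(k).
    """
    k = len(matrix)
    if k < 2:
        raise ValueError("at least two hosts are needed")
    contributions = []
    for y in range(len(matrix[0])):
        column = [row[y] for row in matrix]
        total = sum(column)
        # Assumed normalization: scale the column to sum to 1.
        shares = [c / total for c in column] if total > 0 else [1.0 / k] * k
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(k)
        contributions.append(1.0 - entropy)
    norm = sum(contributions)
    if norm == 0:
        # All columns are uniform: fall back to equal weights.
        return [1.0 / len(contributions)] * len(contributions)
    return [c / norm for c in contributions]


def comprehensive_load(load_means, weights):
    """Weighted sum of the per-resource load means of one host."""
    return sum(w * u for w, u in zip(weights, load_means))


# Two hosts, two resources: the first column is uniform, the second varies.
loads = [[0.5, 0.9], [0.5, 0.1]]
weights = entropy_weights(loads)
```

In this example the uniform column has maximal entropy and receives no weight, so the resource whose load varies across hosts dominates the comprehensive load.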

Host Load Detection Based on Dynamic Load Mean
The pseudocode of the DLM-HLD is shown in Algorithm 1. First, we calculate the load mean $U^d(h_j, T)$ of the host resource $d$; then, the host comprehensive load $L(h_j)$ is calculated based on $U^d(h_j, T)$. Finally, based on the limit thresholds, all the hosts are divided into three categories: overload, normal load and underload. Assume that the upper and lower thresholds of the host comprehensive load are $Thr_{max}$ and $Thr_{min}$, respectively, and the upper and lower load thresholds of host resource $d$ are $Thr^d_{max}$ and $Thr^d_{min}$, respectively.

Algorithm 1 (fragment). Host load detection based on the dynamic load mean (DLM-HLD)
Compute $T$ based on Equation (15)
4. Compute $U^d(h_j, T)$, $t \in T$ and $U^d(h_j, t) \in Ur$, based on Equation (14)
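The sketch below illustrates the general idea of the detection step under explicitly hypothetical rules. The window is widened by one sample for every s of absolute difference between the latest sample and the mean of a base window. A host is classified as overloaded when its comprehensive load or any single resource mean exceeds its upper threshold, and as underloaded when the comprehensive load is below the lower threshold. Neither rule is taken from Equations (14) and (15) or from Algorithm 1; they only mimic the behavior described in the text.

```python
import math


def window_size(history, base=3, s=0.1, max_size=12):
    """Hypothetical window rule: larger fluctuation -> larger window."""
    recent = history[-base:]
    mean = sum(recent) / len(recent)
    diff = abs(history[-1] - mean)
    return min(max_size, len(history), base + math.floor(diff / s))


def load_mean(history, base=3, s=0.1, max_size=12):
    """Mean utilization over the dynamically sized sliding window."""
    size = window_size(history, base, s, max_size)
    window = history[-size:]
    return sum(window) / len(window)


def classify_host(load, resource_means, thr_min, thr_max, resource_thr_max):
    """Assumed three-way classification by comprehensive and per-resource load."""
    if load > thr_max or any(mean > resource_thr_max[d]
                             for d, mean in resource_means.items()):
        return "overload"
    if load < thr_min:
        return "underload"
    return "normal"


# A single spike to 0.8 after a steady 0.2 is damped by the wider window.
cpu_history = [0.2, 0.2, 0.2, 0.2, 0.2, 0.8]
smoothed = load_mean(cpu_history, s=0.15)
```

With a fixed window of three samples the spike would raise the mean to about 0.4; the wider window keeps it near 0.32, so one short fluctuation is less likely to trigger a migration.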

The Proposed DLMM-VMC Algorithm
The following describes the main ideas of the DLMM-VMC. First, according to the DLM-HLD method, the hosts are divided into three categories: overload, normal load and underload. Then, the migration VMs and placement hosts are selected based on the optimized ant colony algorithm. On the one hand, when selecting migration VMs, priority is given to the VMs that most reduce the utilization of the overloaded resource on the host, which effectively reduces the number of migrated VMs. In addition, to save energy, as many VMs as possible are migrated off underload hosts so that more hosts can be shut down. On the other hand, when selecting placement hosts, priority is given to the hosts whose resources are best utilized after placement, which effectively reduces the resources waste. Finally, based on the multi-objective function proposed in Equation (8) and the optimized ant colony algorithm, the optimal solution is obtained. Define the set of mapping tuples of migration VMs and placement hosts as $TC = \{(v_m, h_p)\}$, where $v_m$ is a VM to be migrated and $h_p$ is a placement host on which a migration VM can be placed. The elements of $TC$ serve as food for the ants. The ants search for solutions in $TC$, evaluate them with objective function (8), and finally obtain the optimal solution.
To reduce the time complexity of this method, we optimized the execution process of the ACS, as shown in Figure 1. This method restricts the solution search space of the ACS to a certain range of hosts instead of all the hosts based on the host load types output by the DLM-HLD algorithm. During the VM consolidation process, select the overloaded hosts and the underloaded hosts in turn, and when selecting the placement hosts for migration VMs, select the normal host, underloaded host and new host in turn. Therefore, compared with the original ant colony optimization algorithm, this method reduces the search range of the ants. In addition, to further optimize the execution efficiency for the VM consolidation problem, the heuristic factor takes into account both the migration VMs selection and placement hosts selection. On the one hand, the optimized heuristic factor selects a different migration VM selection heuristic factor based on the source host load type, which ensures minimizing the migration overhead when selecting the migration VMs. On the other hand, the optimized heuristic factor selects a different placement host selection heuristic factor based on the target host load type, which ensures a maximum resource utilization when selecting the placement hosts. The heuristic factor comprehensively considers the two key processes of the migration VMs selection and placement hosts selection, which helps to ensure the solution quality while reducing the blindness of the ant search to improve the execution efficiency of the ACS. In the following, detailed definitions of these factors are given.
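The restriction of the search space described above can be sketched as follows. The code builds candidate (VM, target host) tuples only from VMs on overloaded and underloaded hosts, and only towards normal hosts, underloaded hosts and new hosts, in that order. The function name, the state labels and the data layout are assumptions for illustration, and no capacity check is performed here.

```python
def build_candidate_tuples(host_state, vms_on_host, new_hosts):
    """Return (vm, target_host) tuples restricted by host load type.

    host_state: dict host -> "overload" | "normal" | "underload"
    vms_on_host: dict host -> list of VM ids currently on that host
    new_hosts: inactive hosts that may be switched on as a last resort
    """
    overloaded = [h for h, s in host_state.items() if s == "overload"]
    underloaded = [h for h, s in host_state.items() if s == "underload"]
    normal = [h for h, s in host_state.items() if s == "normal"]

    sources = overloaded + underloaded             # overloaded hosts first
    targets = normal + underloaded + list(new_hosts)

    candidates = []
    for source in sources:
        for vm in vms_on_host.get(source, []):
            for target in targets:
                if target != source:               # never "migrate" in place
                    candidates.append((vm, target))
    return candidates


host_state = {"h1": "overload", "h2": "normal", "h3": "underload"}
vms_on_host = {"h1": ["v1"], "h2": ["v2"], "h3": ["v3"]}
tuples = build_candidate_tuples(host_state, vms_on_host, ["h4"])
```

VMs on normal hosts never appear as migration candidates, which is where the reduction in search space comes from in this sketch.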

Pheromone Definition
Pheromone is the medium through which ants communicate with each other. Ants find food sources by sensing other ants' pheromones, and the higher the pheromone concentration, the greater the preference. Suppose that $\tau_{ij}$ denotes the pheromone on the combination $(v_i, h_j)$ of VM $v_i$ and host $h_j$; the value of $\tau_{ij}$ changes due to the accumulation of new pheromone and the volatilization of old pheromone. The local pheromone update rule is as follows.
where $\rho \in [0, 1]$ is the pheromone volatility coefficient and $\tau_0$ is the initial pheromone, which is a constant. Updating the local pheromone reduces the pheromone concentration to avoid premature convergence to suboptimal solutions. After all the ants have constructed their solutions, the global optimal solution is obtained according to the objective function, and the global pheromone is updated using the global optimal solution to reinforce the experience of the global optimal solution. The global pheromone update rule is as follows.
where $X^+$ is the global optimal solution.
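The two updates can be sketched in the standard ACS form. The local rule blends the current value with the initial pheromone, and the global rule evaporates every entry and deposits pheromone only on tuples of the best solution. Using the reciprocal of the best objective value as the deposit is an assumption, as are the function names; the paper's exact update equations may differ.

```python
def local_update(tau, key, rho, tau0):
    """Local rule: pull the visited tuple's pheromone towards tau0."""
    tau[key] = (1.0 - rho) * tau[key] + rho * tau0


def global_update(tau, best_solution, best_cost, rho):
    """Global rule: evaporate everywhere, deposit only on the best solution.

    best_solution: set of (vm, host) tuples of the global best X+
    best_cost: objective value of X+ (assumed > 0, lower is better)
    """
    deposit = 1.0 / best_cost  # assumption: better solutions deposit more
    for key in tau:
        reinforcement = deposit if key in best_solution else 0.0
        tau[key] = (1.0 - rho) * tau[key] + rho * reinforcement


tau = {("v1", "h1"): 1.0, ("v1", "h2"): 1.0}
local_update(tau, ("v1", "h1"), rho=0.5, tau0=0.2)
global_update(tau, {("v1", "h2")}, best_cost=2.0, rho=0.5)
```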

Definition of Heuristic Factor
In addition to the pheromone, the heuristic factor $\eta_{ij}$ is another critical factor in an ant colony algorithm. The heuristic factor represents the expectation that VM $v_i$ is assigned to host $h_j$. The larger the heuristic information is, the greater the probability of the corresponding choice. Therefore, a reasonable setting of the heuristic factor can reduce the search blindness and improve the search efficiency of the ant colony. This paper comprehensively considers the service performance and energy consumption, and sets the heuristic factor as shown in the following equation, which consists of two parts: the selection of migration VMs and the selection of placement hosts.
where $\eta_v(h_i, -v_i)$ is the migration VM selection heuristic factor, which indicates that VM $v_i$ is migrated from host $h_i$, and $\eta_h(h_j, +v_i)$ is the placement host selection heuristic factor, which indicates that VM $v_i$ is migrated to host $h_j$. $\lambda$ is the weight that measures the relative importance of the two. The settings of $\eta_v(h_i, -v_i)$ and $\eta_h(h_j, +v_i)$ are described in detail below.

Migration VM Selection
For overloaded hosts, any kind of resource overload may affect the QoS and result in SLAv. In addition, an improper policy for selecting migration VMs causes too many VMs to be migrated, which increases the migration overhead. Therefore, for overloaded hosts, the main strategy for selecting a migration VM is to minimize the number and duration of VM migrations while comprehensively considering multi-dimensional resources, so that the host is quickly restored from the overload state to the normal state. We define the migration VM selection heuristic factor for the overload hosts as follows.
The factor uses the comprehensive descending gradient of the load of host $h_i$ after VM $v_i$ is migrated out of host $h_i$, where $\omega(d)$ is the weight value obtained from Equation (18); the greater the descent gradient, the greater the probability of VM $v_i$ being selected. Additionally, the migration time $T_{mig}(h_i, -v_i)$ is considered as a migration VM selection factor. This paper uses the migration time evaluation model proposed in [41] to calculate $T_{mig}(h_i, -v_i)$, which evaluates the VM migration time based on the current memory usage, dirty pages and data transfer rate. $\eta_v(h_i, -v_i)$ thus comprehensively considers the number and duration of VM migrations. When selecting a migration VM, the faster the host overload state decreases and the shorter the migration time, the more likely the VM is to be selected.
For underload hosts, in order to minimize the underload host number, preference is given to VMs that can significantly reduce the host's resources utilization after migration to shut down the host. Therefore, we define the migration VM selection heuristic factor on the underload hosts as follows.

Placement Host Selection
When any resource of a host is overloaded, the host performance drops rapidly. Therefore, when a normally loaded host is selected as the placement host, the one with more remaining resources is preferred. In addition, resources waste is considered, and the host with less resources waste is chosen. We comprehensively consider the QoS and resources waste and set the heuristic factor for normally loaded hosts in the following formula.
where 1 − L(h_j, +v_i) and W_{h_j} are the remaining comprehensive load and resources waste of host h_j after deploying VM v_i. A larger η_h(h_j, +v_i) corresponds to a greater remaining comprehensive load and less resources waste. For the underloaded hosts, resource utilization is low and resource competition is weak, which guarantees the QoS but wastes resources and energy. Therefore, when selecting underloaded hosts as the placement hosts, the hosts with a higher resource utilization after deploying the VMs are preferred so that the resources are fully utilized. The corresponding heuristic factor is defined as follows.

Pseudo-Random Proportion Rule
Based on the heuristic factor and pheromone information, the ants construct the solution according to the following pseudo-random proportion rule.
where q ∈ [0, 1] is a uniformly distributed random number and q_0 ∈ [0, 1] is a fixed parameter determining the relative importance of cumulative experience and random selection. α and β indicate the importance of the pheromone and the heuristic factor, respectively. When q ≤ q_0, the step is called exploitation, which helps the ants converge quickly to a high-quality solution; otherwise, it is called exploration, in which the ants randomly select a tuple (v_i, h_j) according to the probability distribution defined in the following equation, which helps the ants discover new choices.
where TC_k^allow denotes the set of tuples that ant k is allowed to traverse, and p_k^mp denotes the probability that ant k next selects the tuple (v_m, h_p).
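The rule can be sketched in code. The sketch below assumes the standard ant colony system form of the rule, in which the attractiveness of a tuple is τ^α·η^β, exploitation takes the tuple with the largest value, and exploration draws a tuple with probability proportional to that value; the exact expressions in the paper's equations may differ. The function name `select_tuple`, its default parameter values and the dictionary-based inputs are illustrative assumptions.

```python
import random


def select_tuple(candidates, tau, eta, alpha=1.0, beta=2.0, q0=0.9, rng=random):
    """Pick one (vm, host) tuple using a pseudo-random proportional rule.

    candidates: list of (vm, host) tuples the ant may still choose.
    tau, eta:   dicts mapping each tuple to its pheromone / heuristic value.
    Returns None when there is nothing to choose from.
    """
    if not candidates:
        return None

    # Attractiveness of each tuple (assumed form): tau^alpha * eta^beta.
    scores = [(tau[c] ** alpha) * (eta[c] ** beta) for c in candidates]

    q = rng.random()
    if q <= q0:
        # Exploitation: take the most attractive tuple.
        best = max(range(len(candidates)), key=scores.__getitem__)
        return candidates[best]

    # Exploration: roulette-wheel draw proportional to the scores.
    if sum(scores) <= 0:
        return rng.choice(candidates)
    return rng.choices(candidates, weights=scores, k=1)[0]
```

A larger `q0` makes the ants rely more on accumulated experience, while a smaller one spends more steps on exploration. Returning `None` for an empty candidate set loosely mirrors the null check on (v_m, h_p) in Algorithm 2.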
...
Based on Equation (26), select (v_m, h_p) ∈ TC.
9. If (v_m, h_p) is null, then
...
14. If (v_m, h_p) is null, then
15. Break
16. Update mapping relation matrix X.
...
21. while H_u ≠ ∅ do
22. TC ← {(v_m, h_p) | ∀v_m ∈ VM_{h_j}, h_j ∈ H_u and ∀h_p ∈ H_n}
...
26. If (v_m, h_p) is null, then
27. TC ← {(v_m, h_p) | ∀v_m ∈ VM_{h_j}, h_j ∈ H_u and ∀h_p ∈ H_u, j = d}
...
30. If (v_m, h_p) is null, then
31. Break
32. Update mapping relation matrix X.
...
End for
As shown in Algorithm 2, the maximum time complexity of this algorithm is O(nI·nA·m·n), where nI is the number of iterations, nA is the number of ants, m is the number of VMs and n is the number of hosts. In line 4 and line 21, the while loops traverse the overloaded and underloaded hosts, with fewer than n traversals; in line 5, line 10, line 22 and line 27, the VMs on the overloaded and underloaded hosts are traversed in turn to construct the solution space, with fewer than m·n traversals. Because overloaded and underloaded hosts are handled separately, and normally loaded and underloaded hosts are considered sequentially when selecting the placement hosts, the numbers of VMs and hosts traversed each time are less than m and n, respectively. Therefore, the final complexity of the DLMM-VMC is at most O(nI·nA·m·n).

Experimental Setup
The proposed algorithm was evaluated using the simulator CloudSim [30], which is a cloud computing environment simulation framework that can simulate most of the resources and behaviors of the cloud systems.
This experiment simulated a cloud data center with two types of hosts, HP ProLiant G4 and ProLiant G5. Their configurations are shown in Table 1, and their energy consumption characteristics are shown in Table 2. The hosts were connected through a gigabit network. Four types of Amazon EC2 VMs [17] were used, and their configuration information is shown in Table 3. After the VM instances were created, they were initially deployed based on the resource requirements of the VM type. To verify the validity of the proposed algorithm, the real-world workload dataset of the Google cluster data (GCD) [42] was used in the experiment. The GCD provides trace data covering approximately one month in May 2011, sampled every five minutes, and records the utilization of multiple resources, such as the CPU and memory. The data of different days were randomly selected from the processed data. The statistical characteristics of the critical resources of the 1600 VMs are shown in Table 4. The algorithm-related parameter settings are shown in Table 5. Here, we set ϑ_i = 0.25 and d = 2, which means that the multiple objectives have the same weight and that the CPU and memory resources are considered.

Performance Metrics
The service level agreement (SLA) is an agreement reached between cloud service providers and users on services, priorities and responsibilities. If the SLA is violated, the users' interests cannot be guaranteed, and the cloud service providers may have to pay expensive fines to users as compensation. Therefore, the SLA is an important metric for a data center's QoS. SLAv [17] is an independent metric of SLA violations, measured from two aspects: the SLA violation time caused by host overload (SLAHv) and the performance degradation caused by VM migration (SLAMv). These two aspects are independent and have the same impact on SLAv. Therefore, the total SLAv is calculated as follows: The SLAHv indicates the percentage of time during which the CPU or memory usage of a host reached 100%, and the SLAMv indicates the overall performance degradation caused by VM migration. The SLAHv and SLAMv values are calculated as follows: where n and m indicate the numbers of hosts and VMs in a data center, respectively; T_hj and T_aj represent the time during which the utilization of host h_j reached 100% and its total running time, respectively; C_vi represents the capacity of the unfulfilled resource requests caused by the VM migration, which is an estimate of the performance degradation caused by the VM migration; and C_ai is the total CPU requirement of the VM v_i during its lifetime. According to [17], the performance degradation in SLAMv can be estimated as 10% of the CPU utilization during the VM migration. The energy consumption in the VM consolidation is an important evaluation metric. However, when optimizing the energy consumption, it must be balanced against SLAv. The comprehensive evaluation metric PSV combines the total energy consumption and SLAv, and is defined as follows.
where P is the total energy consumption according to Equation (3). A low PSV value indicates that the data center performs well in terms of both energy consumption and QoS. In addition, the network communication overhead and migration overhead are also optimization objectives in this paper, and they are evaluated based on Equations (6) and (7), respectively. The network communication overhead a(v_i, v_j) between VMs v_i and v_j is calculated by referring to [43].
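As a worked illustration of these metrics, the following Python sketch assumes the product form commonly used with the metrics of [17]: SLAHv as the mean of T_hj/T_aj over the n hosts, SLAMv as the mean of C_vi/C_ai over the m VMs, SLAv = SLAHv × SLAMv and PSV = P × SLAv. The function names and the sample numbers are hypothetical and are not taken from the experiments.

```python
def sla_hv(overload_time, running_time):
    """Mean fraction of running time each host spent at 100% utilization."""
    ratios = [t_h / t_a for t_h, t_a in zip(overload_time, running_time)]
    return sum(ratios) / len(ratios)


def sla_mv(unfulfilled_capacity, requested_capacity):
    """Mean fraction of each VM's requested CPU capacity lost to migration."""
    ratios = [c_v / c_a for c_v, c_a in zip(unfulfilled_capacity, requested_capacity)]
    return sum(ratios) / len(ratios)


def sla_v(hv, mv):
    """Combined SLA violation metric (assumed product form)."""
    return hv * mv


def psv(total_energy, slav):
    """Combined energy / SLA metric (assumed product form)."""
    return total_energy * slav


# Two hosts: one overloaded for 30 of 300 minutes, one never overloaded.
hv = sla_hv([30.0, 0.0], [300.0, 600.0])
# Two VMs: 10 and 20 units of unfulfilled capacity out of 1000 requested each.
mv = sla_mv([10.0, 20.0], [1000.0, 1000.0])
combined = psv(100.0, sla_v(hv, mv))
```

Because both factors are fractions, the product form rewards an algorithm only when host overload and migration degradation are both kept low, which matches the statement that the two aspects have the same impact on SLAv.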

Performance of DLM-HLD
In this experiment, data centers of different sizes were used to evaluate the performance of the DLM-HLD algorithm. The number of hosts in a data center varied from 100 to 1500, and each host was initially deployed with two VMs on average. The experiment focuses on the impact of the dynamic load mean on the numbers of overloaded, underloaded and active hosts and on the number of migrations in data centers of different sizes. The DLM-HLD scheme was compared with the static and dynamic threshold detection methods proposed in [17]. The static threshold detection method (THR-HLD) set the maximum utilization of the CPU and memory to 80%. The dynamic threshold detection method (LR-HLD) estimated the threshold using the local regression (LR) method and detected overloaded hosts according to the estimated CPU and memory utilization values. The VM consolidation test was performed every 5 min, and the test results were recorded over 24 h. Figure 2 shows the test results.
According to Figure 2a,b, compared with the THR-HLD and LR-HLD algorithms, the DLM-HLD algorithm proposed in this paper detected the fewest overloaded and underloaded hosts in the VM consolidation. For a 1500-node data center, the DLM-HLD detected 56.8% and 40.6% fewer overloaded hosts and 58.9% and 59.5% fewer underloaded hosts than the THR-HLD and LR-HLD, respectively. The THR-HLD algorithm uses the current resource utilization as the criterion for detecting host overload or underload, without considering the dynamic changes in the resource load. As soon as the current resource utilization crosses the set threshold, the host is judged to be overloaded or underloaded, so even occasional load fluctuations trigger a detection. This leads to misjudgments in the VM consolidation and thus increases the number of overloaded or underloaded hosts. Although the LR-HLD can predict future resource utilization, it cannot predict occasional load fluctuations. The DLM-HLD algorithm considers the dynamic load mean of resources over a period, which not only accurately judges the trend of resource usage but also filters out occasional load fluctuations, thus effectively reducing misjudgments in the host load detection.
Next, the impact of the dynamic load mean (DLM) on the number of active hosts in data centers of different sizes was analyzed. As shown in Figure 2c, the DLM-HLD method required the fewest hosts to be activated. From the above analysis, we know that the THR-HLD and LR-HLD algorithms detected more overloaded and underloaded hosts. Because each VM consolidation migrates VMs from the overloaded and underloaded hosts, these algorithms migrated more VMs, as shown in Figure 2d, and more hosts had to be activated to place the additional migrated VMs.
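The filtering effect described above can be illustrated with a small sketch. It is not the DLM-HLD algorithm itself: it assumes a simplified stand-in that compares a static 80% threshold on the current sample with the same threshold applied to the mean of the last few samples. The function names, the window length of 3 and the utilization trace are hypothetical.

```python
from collections import deque


def static_threshold_flags(samples, threshold=0.8):
    """Flag every sample whose instantaneous utilization exceeds the threshold."""
    return [u > threshold for u in samples]


def windowed_mean_flags(samples, threshold=0.8, window=3):
    """Flag a sample only when the mean of the last `window` samples exceeds the threshold."""
    recent = deque(maxlen=window)  # keeps only the most recent samples
    flags = []
    for u in samples:
        recent.append(u)
        flags.append(sum(recent) / len(recent) > threshold)
    return flags


# One isolated spike (0.95) followed later by a sustained rise.
trace = [0.50, 0.55, 0.95, 0.50, 0.52, 0.85, 0.90, 0.92]
static_flags = static_threshold_flags(trace)
mean_flags = windowed_mean_flags(trace)
```

On this trace the static rule flags four samples, including the isolated spike, while the windowed mean flags only the last sample of the sustained rise. Fewer spurious overload detections mean fewer migrations in the subsequent consolidation step.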

Multi-Objective Optimization Performance
This section evaluated the multi-objective optimization performance of the DLMM-VMC. Four heuristic algorithms and two meta-heuristic algorithms were used as the comparison benchmarks. For the heuristic algorithms, host load detection used two algorithms: the static threshold THR and the dynamic threshold LR [17]. The migration VM selection used the minimum migration time algorithm MMT [17], and the placement host selection used both the FF [3] and PABFD [17] algorithms. The two meta-heuristic algorithms were ACS-VMC [40] and QMOD [13]. The data center was sized to deploy 400, 800 and 1200 VMs on 400 physical servers for testing. The VM consolidation experiments were executed every 5 min, and the test results were recorded over 24 h, as shown in Figure 3.
Figure 3a shows the energy consumption comparison. For the data center with 1200 VMs, the DLMM-VMC reduced the energy consumption by 30.8%, 27.7%, 30.3%, 19.1%, 23.5%, 19.3% and 9.8% compared with the THR-MMT-FF, THR-MMT-PABFD, LR-MMT-PABFD, ACS-VMC and QMOD, respectively. On the one hand, energy consumption was a major optimization objective in the DLMM-VMC, and the DLM-HLD method effectively reduced the number of active hosts. In addition, the resources waste was also an optimization objective. Figure 3b shows that the DLMM-VMC method has the least resources waste, which indicates that the DLMM-VMC makes full use of resources and can therefore minimize the number of active hosts when deploying the same number of VMs. On the other hand, the DLMM-VMC optimized the heuristic factor of the ant colony algorithm. When selecting the placement hosts, the optimized heuristic factor fully considered the comprehensive resource utilization and resources waste of the host and selected the host with less resources waste under the constraints. Therefore, compared with the other algorithms, the DLMM-VMC effectively reduced the energy consumption.
Figure 4a illustrates the SLAv comparison. The results show that the DLMM-VMC has the best performance in SLAv, followed by the QMOD, and the THR-MMT-PABFD has the worst performance. For the data center with 1200 VMs, the SLAv of the DLMM-VMC was 73.6% of that of the QMOD and only 24.1% of that of the LR-MMT-PABFD, which indicates that the DLMM-VMC algorithm effectively guaranteed the QoS. The SLAv is composed of SLAHv and SLAMv. To analyze the SLAv in more detail, we further analyzed the SLAHv and SLAMv.
Figure 4b illustrates the SLAHv comparison. The results show that the DLMM-VMC algorithm has the lowest SLAHv, which indicates that the DLMM-VMC significantly improves the host's QoS. Because the DLMM-VMC considered the multi-dimensional resources of the host in the host overload detection, it avoided the SLAv caused by the overload of any single resource on the host and thus effectively guaranteed the host's QoS.
Figure 4c shows that the DLMM-VMC has the best performance in SLAMv compared to the other algorithms, which indicates that the DLMM-VMC effectively reduces the impact of migration on the VMs' QoS. On the one hand, the objective function defined by the DLMM-VMC tends to minimize the number of VM migrations, as Figure 4d demonstrates. On the other hand, the DLM-HLD effectively avoided unnecessary VM migrations caused by load fluctuations. In addition, the DLMM-VMC ensured that the overloaded hosts were quickly and accurately restored to a normal load level with a minimal migration overhead based on the optimized heuristic factors when selecting migration VMs. Therefore, compared with the other algorithms, the DLMM-VMC had obvious advantages in SLAv.
Figure 5a shows the network overhead comparison. Based on the tree network topology, the results show that the DLMM-VMC has the minimum network overhead, which indicates that the network overhead model proposed in this paper effectively reduces the network communication cost. The DLMM-VMC placed interdependent VMs close to each other so as to reduce the number of network elements traversed during communication. Transmitted data is processed and forwarded at each network element it passes through, which increases the transmission delay. If VMs that communicate with each other are deployed on the same server as far as possible, the communication traffic handled by the network elements in the data center is greatly reduced, which not only reduces the overhead of the network resources but also improves the overall communication performance of the data center.
Figure 5b shows the comprehensive performance comparison. The results show that the DLMM-VMC has the lowest PSV value, indicating that its comprehensive performance is the best. The PSV is composed of the total energy consumption P and SLAv. The above analysis shows that the DLMM-VMC achieves the best P and SLAv among the compared algorithms; therefore, its PSV is also the best.
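The effect of placing interdependent VMs close together can be illustrated with a toy cost function. The sketch assumes a two-level tree in which traffic between two VMs crosses 0 switches when they share a host, 1 switch when their hosts share a rack and 3 switches when the hosts are in different racks, and it weights each VM pair's traffic by that count. This is an illustrative model, not Equation (6), and all names and numbers are hypothetical.

```python
def switches_between(host_a, host_b, rack_of):
    """Number of switches crossed between two hosts in the assumed two-level tree."""
    if host_a == host_b:
        return 0  # same server: traffic never enters the network
    if rack_of[host_a] == rack_of[host_b]:
        return 1  # same rack: one edge switch
    return 3      # different racks: edge, core, edge


def communication_cost(placement, traffic, rack_of):
    """Sum of pairwise traffic rates weighted by the switches each flow crosses."""
    return sum(
        rate * switches_between(placement[a], placement[b], rack_of)
        for (a, b), rate in traffic.items()
    )


rack_of = {"h1": "r1", "h2": "r1", "h3": "r2"}
traffic = {("v1", "v2"): 10, ("v2", "v3"): 5}  # traffic rate per VM pair

spread = {"v1": "h1", "v2": "h3", "v3": "h2"}      # communicating VMs far apart
colocated = {"v1": "h1", "v2": "h1", "v3": "h2"}   # heaviest pair on one server

cost_spread = communication_cost(spread, traffic, rack_of)
cost_colocated = communication_cost(colocated, traffic, rack_of)
```

In this toy example the cost falls from 45 to 5 once the heaviest-communicating pair shares a server, which is the kind of reduction the placement strategy aims for.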

Execution Efficiency Analysis
To analyze the efficiency of the DLMM-VMC in more depth, the execution time was analyzed, as shown in Figure 6a. Owing to their low time complexity, the four heuristic algorithms had shorter execution times than the three meta-heuristic algorithms. However, the DLMM-VMC was faster than the other two meta-heuristics and close to the heuristics. The DLMM-VMC algorithm limited the solution search space of the ant colony based on the host load types, which effectively improved the execution efficiency.

In addition, we compared the DLMM-VMC with the ACS-VMC in terms of convergence. We calculated the objective function value according to Equation (8) and ran each of the two algorithms 10 times. The number of VMs was set to 400. As seen in Figure 6b, both the DLMM-VMC and ACS-VMC converged within 500 iterations, and the DLMM-VMC reached a smaller objective value. The DLMM-VMC also converged significantly faster: it started to converge after 150 iterations, whereas the ACS-VMC showed a convergence trend only after 260 iterations. This indicates that the DLMM-VMC improves the convergence performance of the algorithm.

Conclusions
This paper focuses on how to optimize the energy consumption, resource utilization, QoS, migration overhead and network communication overhead in cloud data centers, and proposes the DLMM-VMC algorithm for this purpose. The DLMM-VMC formulates the VM consolidation problem as a multi-objective optimization problem. First, a host load detection method based on the dynamic load mean is proposed to objectively and accurately evaluate the real load state of the hosts, which avoids the deficiency of considering only single-dimensional resources in VM consolidation and also reduces the unnecessary VM migrations caused by system load fluctuations. Then, an optimized ant colony algorithm is proposed to obtain the optimal mapping scheme between the hosts and the VMs. In this process, the heuristic factor and the execution process of the ACS are optimized to improve both the multi-objective optimization and the execution efficiency. Finally, the experimental results show that the DLMM-VMC is effective in reducing the energy consumption, optimizing resource utilization, guaranteeing QoS and reducing the migration overhead and network communication overhead compared with other algorithms.
This paper ignores the energy consumption generated by other devices in the data center, such as the network elements and refrigeration equipment, and their impact on the system's performance. In the future, we will comprehensively consider these factors in VM consolidation research to further optimize the energy consumption.
Author Contributions: Conceptualization, P.L. and J.C.; methodology, P.L.; software, P.L.; validation, P.L. and J.C.; formal analysis, P.L.; investigation, P.L.; resources, P.L.; data curation, P.L.; writing-original draft preparation, P.L.; writing-review and editing, P.L. and J.C.; visualization, P.L.;
