A Novel Hybrid-Copy Algorithm for Live Migration of Virtual Machine



Introduction
Cloud computing [1] has become one of the most dominant features in the field of computing. In cloud environments, dynamic resource scheduling [2] is a key problem. The live migration [3] of virtual machines is an important approach to dynamic resource scheduling in cloud environments. With live migration, a virtual machine (VM) can be transferred from one host to another while the services provided by the VM remain available. This is useful in several scenarios, such as load balancing [4], online maintenance [5], fault tolerance [6], and energy conservation [7].
The key to VM migration is how to handle the resources associated with the migrating VM. The main resources related to VM migration are memory data, network connections, CPU state, virtual devices, and storage. In practice, the storage of a VM is typically kept on a network-attached storage (NAS) device [8]. Because the NAS is accessible to all hosts in the datacenter, the storage does not need to be migrated. In this situation, memory data becomes the main data transferred in the migration process.
There are many approaches to migrating a VM from one host to another. However, downtime and total migration time must be considered when the migrated VM is running [3]. On the one hand, we need to minimize the downtime, the period in which the service is totally unavailable. On the other hand, the source host may be shut down at any time, and the performance of the VM is seriously affected during the migration process, so the total migration time should be shortened. Many live migration algorithms have been proposed to minimize both downtime and total migration time. In terms of the order in which memory data is transferred, live migration algorithms can be classified into three categories: the pre-copy algorithm [3,9-12], the post-copy algorithm [13-15], and the hybrid-copy algorithm [16-18]. The hybrid-copy algorithm combines the pre-copy algorithm with the post-copy algorithm to remedy the defects of both. It consists of two phases: the pre-copy phase and the post-copy phase. In the pre-copy phase, all memory pages are transferred from the source to the destination in advance, so the page faults occurring in the post-copy phase can be effectively reduced. Currently, the hybrid-copy algorithm copies all memory pages only once in advance. In a write-intensive workload, copying memory pages once may be enough. In a read-intensive workload, however, additional copy rounds can significantly reduce the number of page faults. Page faults always incur a significant performance loss [13-15]; sometimes they can even cause the services in the migrated VM to be totally unavailable.
Future Internet 2017, 9, 37; doi:10.3390/fi9030037; www.mdpi.com/journal/futureinternet
In this paper, we propose a novel hybrid-copy algorithm. The main goal is to improve the performance of the hybrid-copy algorithm by reducing the number of page faults while keeping the migration time at the same level. A new parameter named the switched decision factor (SDF) is proposed to decide the moment to switch from the pre-copy phase to the post-copy phase. It helps reduce the number of page faults and minimize the downtime caused by page faults in the migration process. A Markov model is used in our novel hybrid-copy algorithm to reduce invalid transfers, which ensures that the migration time is not increased.
We implemented the novel hybrid-copy algorithm on the Xen virtualization software and ran experiments on a write-intensive workload and a read-intensive workload. Through extensive evaluations, we demonstrate that the novel hybrid-copy algorithm can significantly reduce page faults in read-intensive workloads while achieving the same level of total migration time. In write-intensive workloads, it retains the good performance of the original hybrid-copy algorithm.
In summary, we analyze the memory access patterns of different workloads and adopt a Markov model to predict the order in which memory pages are accessed. We propose a new parameter to decide the opportune moment at which the pre-copy algorithm stops and the post-copy algorithm takes over. As a result, we obtain good performance in both read-intensive and write-intensive workloads. We tested our work with two workloads and demonstrate that the novel hybrid-copy algorithm performs better with both kinds of workloads.
The rest of the paper is organized as follows. Section 2 analyzes the pre-copy, post-copy, and hybrid-copy algorithms. Section 3 introduces the prototype of the novel hybrid-copy algorithm, including how the Markov model is used and how the new parameter works. Section 4 presents the evaluation. Section 5 introduces related work, and Section 6 gives the summary and future work.

Background
Many live migration algorithms have been proposed. In terms of the order in which memory data is transferred, live migration algorithms are generally classified into three categories: pre-copy, post-copy, and hybrid-copy.

Pre-Copy
The pre-copy algorithm is the most popular algorithm in the VM live migration domain. Its core idea is to minimize the downtime of the migration process by maximizing the synchronization of the memory image between the source host and the destination host. Figure 1 provides an overview of the pre-copy algorithm architecture. The pre-copy algorithm can be divided into the following two phases:
(1) Iteration phase: In this phase, the memory image at the destination is brought as close as possible to that at the source by iteratively copying memory data from the source to the destination. Because the source VM keeps running, some memory pages are modified during each transmission round. These pages must be resynchronized in subsequent rounds, until the number of remaining dirty pages falls below a threshold or the number of rounds exceeds an iteration threshold.
(2) Stop-and-copy phase: In this phase, the source VM is stopped and the remaining pages are copied to the destination VM. Then the new VM on the destination host is started.
The pre-copy algorithm can shorten the downtime significantly for a read-intensive VM. However, if the VM runs a write-intensive workload, memory may be written faster than the network can transmit it. As a result, the iterative copying just wastes bandwidth and prolongs the total migration time.
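The two phases above can be illustrated with a small simulation. This is a minimal sketch rather than the Xen implementation: the function name, the `dirty_rate` parameter (the probability that a page is written while one round is in flight), and the uniform write model are all assumptions made for illustration.

```python
import random


def simulate_pre_copy(total_pages, dirty_rate, page_threshold, max_rounds, seed=0):
    """Simulate pre-copy and return (rounds, pages_sent, stop_and_copy_pages).

    dirty_rate is a hypothetical parameter: the probability that any given
    page is written while one round of copying is in progress.
    """
    rng = random.Random(seed)
    dirty = set(range(total_pages))  # every page starts dirty
    pages_sent = 0
    rounds = 0
    # Iteration phase: repeat until few pages remain or the round limit is hit.
    while len(dirty) >= page_threshold and rounds < max_rounds:
        rounds += 1
        pages_sent += len(dirty)  # copy all currently dirty pages
        # Pages written during this round have to be sent again later.
        dirty = {p for p in range(total_pages) if rng.random() < dirty_rate}
    # Stop-and-copy phase: the VM is paused and the remaining pages are sent.
    stop_and_copy_pages = len(dirty)
    pages_sent += stop_and_copy_pages
    return rounds, pages_sent, stop_and_copy_pages
```

With `dirty_rate=0.0` the loop ends after one round and every page is sent once. With `dirty_rate=1.0`, an extreme write-intensive case, every round resends the whole memory until the round limit is reached, which is the bandwidth waste described above.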

Post-Copy
In contrast to the pre-copy algorithm, the post-copy algorithm transfers the memory data after the new VM at the destination has started. The first step of the post-copy algorithm is to suspend the VM at the source. Then, the VCPU state, device states, and some kernel data are transferred to the destination, and the new VM is resumed there. Last, the memory data at the source is transferred to the destination in two main ways: on-demand paging and active pushing.
On-demand paging is the simplest way to synchronize the memory data. Once the VM starts at the destination and accesses a memory page that has not yet been synchronized, a page fault occurs, and the VM fetches the corresponding page from the source. However, the network delay seriously degrades VM performance. From the user's perspective, it feels as if the server has crashed when page faults occur, so page faults should be avoided as far as possible during the migration process. Moreover, without any other mechanism, the migration can last a long time because some memory pages of the VM may not be accessed for a long time.
Active pushing is one way to reduce the duration of residual dependencies on the source node. The source remains alive until the migration process is complete, and it proactively pushes the remaining memory pages to the destination.
In practice, on-demand paging and active pushing always work together in the post-copy algorithm. The post-copy algorithm ensures that all memory data is transferred only once, so its total migration time is minimal, close to that achieved by non-live VM migration. However, the disadvantages of the post-copy algorithm are obvious: too many page faults generate extra network traffic and seriously degrade the performance of the VM.
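The interplay of on-demand paging and active pushing can be sketched as follows. This is a toy model under stated assumptions: the source pushes pages in address order, `push_per_step` (the number of pages pushed between two guest accesses) is a hypothetical parameter, and network latency is not modeled.

```python
def simulate_post_copy(total_pages, access_sequence, push_per_step):
    """Return (page_faults, pages_transferred) for a toy post-copy run."""
    present = set()  # pages already available at the destination
    next_push = 0    # next page number the source will try to push
    faults = 0
    for page in access_sequence:
        if page not in present:
            faults += 1        # on-demand paging: fetch the page from the source
            present.add(page)
        # Active pushing: the source sends a few more pages in the background.
        pushed = 0
        while pushed < push_per_step and next_push < total_pages:
            if next_push not in present:
                present.add(next_push)
                pushed += 1
            next_push += 1
    # Pages the guest never touched are still pushed, each exactly once.
    present.update(range(next_push, total_pages))
    return faults, len(present)
```

In this model every page crosses the network exactly once, whatever the access pattern; only the number of page faults changes with the amount of active pushing.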

Hybrid-Copy
The hybrid-copy algorithm begins by running a single round of the pre-copy algorithm. In this phase, the source VM continues providing service to users while all the memory data is copied to the destination. The VM is then suspended, and its processor state is copied to the destination without the remaining dirty memory pages. The VM then resumes at the destination immediately, and the post-copy algorithm kicks in. The rest of the memory pages are synchronized in the post-copy phase.
Page faults in the post-copy phase always incur a significant performance loss; sometimes they can even cause the services in the migrated VM to be totally unavailable. The hybrid-copy algorithm avoids many page faults thanks to its pre-copy phase. Further, because it shares the post-copy mechanism, it also alleviates the problem the pre-copy algorithm has with write-intensive workloads. In the original hybrid-copy algorithm, the pre-copy algorithm runs for only a single round. The page faults occurring in the post-copy phase can be further reduced if more copy iterations are executed. However, a predetermined number of rounds cannot suit all workloads. If the network is in good condition and the VM runs a read-intensive workload, more iterative copy rounds should be executed to reduce page faults further. Conversely, if the network condition is poor or the VM runs a write-intensive workload, fewer iterative copy rounds should be performed to save network resources and migration time. In this paper, we propose a new parameter that determines the right moment to stop the iterative copy phase based on the actual situation.

Novel Hybrid-Copy Algorithm
Here, we present the design of the novel hybrid-copy algorithm. The algorithm needs to forecast the memory write pattern in each round of iteration in the pre-copy phase. Instead of transmitting all dirty pages, each iterative copy round transfers only the half of the dirty pages that are least likely to be modified during transmission. Part of the duplicate transmissions can therefore be avoided. After each round of iterative copy, the SDF value is calculated to determine the proper time to stop the iteration phase. The post-copy algorithm then runs until the VM has been transferred completely. In the post-copy phase, the Markov model continues to forecast the memory access pattern, and the source actively pushes the remaining dirty memory pages to the destination based on the predicted result to further reduce the number of page faults.
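The selection of the half of the dirty pages least likely to be modified can be sketched as below. This is an illustration only: `filter_dirty` and `write_probability` are hypothetical names, and the probabilities are assumed to come from some forecasting model. Rounding up for odd counts is also an assumption.

```python
def filter_dirty(dirty_pages, write_probability):
    """Return the half of dirty_pages with the lowest predicted write probability.

    write_probability maps a page number to the forecast chance that the page
    is written again soon; pages missing from the map are treated as 0.0.
    """
    ranked = sorted(dirty_pages, key=lambda p: (write_probability.get(p, 0.0), p))
    keep = (len(ranked) + 1) // 2  # round up so a single dirty page is still sent
    return ranked[:keep]
```

For example, with four dirty pages and forecasts `{1: 0.9, 2: 0.1, 3: 0.5, 4: 0.2}`, pages 2 and 4 are sent in this round, while pages 1 and 3 are deferred because they are likely to be dirtied again.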

Markov Model of Memory Write
Assume that the set $M = \{m_1, m_2, \ldots, m_n\}$ represents all memory pages in the source VM. The set $M' = \{m'_1, m'_2, \ldots, m'_n\}$ is the set of memory pages in the destination VM. We need to ensure that $M' = M$ after the live migration of the VM.

Definition 1 (Dirty Page).
A dirty page at the source is a memory page whose state differs from that at the destination. These pages should be copied to the destination.
All memory pages in the set $M$ are marked as dirty at the beginning. Assume that $m_1$ is modified in the $i$-th round of the iteration phase. If $m_1$ was synchronized before the $i$-th round, it must be synchronized again after the $i$-th iterative copy round. If we can forecast when a memory page will be modified, a great deal of time and bandwidth can be saved. The Markov model is a classical forecasting model: Joseph et al. [19] have shown that the Markov model achieves good results in memory prefetching. Therefore, we chose the Markov model as the forecasting model in our novel hybrid-copy algorithm.
In the observation period, let all memory pages in the source be the input. We need to build a forecasting model based on the analysis of the input data to forecast which pages are going to be dirtied in the next round. Assume a VM with 1 GB of memory and a 4 KB page size. Such a VM contains 262,144 pages, and in a real case a VM may have even more memory. Analyzing such a large data set is a heavy burden for the live migration task.
In the pre-copy algorithm implemented by Clark et al. [3], the concept of the "writable working set" (WWS) is proposed. It is the set of globally frequently modified pages in the iteration phase. We can use the WWS as the input instead of all the memory pages. Assume $W = \{w_1, w_2, \ldots, w_t\}$ represents the memory pages that belong to the WWS.
To simplify the problem, we use a dirty memory page to describe a memory write state. For example, if $w_1$ is modified, then $w_1$ is used to describe the present state, so $t$ kinds of states can arise during the observation period. In our setup, the VM is configured with 1 GB of RAM and two VCPUs, various workloads run at different times, and we track dirty pages through the shadow page table every 50 ms. Assume that $w_1$ and $w_2$ are modified during the first 50 ms, $w_2$, $w_3$, and $w_4$ during the next 50 ms, and $w_1$, $w_4$, and $w_5$ during the third 50 ms. We assume that every state that occurs during the second 50 ms is associated with the states that appeared during the first 50 ms. Similarly, the states that occur in the third 50 ms are related to those that arose during the second 50 ms. Even with only three observation periods, we can calculate the state transition probabilities. The transitions between all states are shown in Figure 2. In this example, different states are described by the dirtied memory pages. Each transition from node X to node Y in the diagram is assigned a weight representing the transition probability from X to Y. For example, if $w_1$ is modified, there is a 50% probability that $w_2$ will be changed in the next 50 ms, and $w_3$ and $w_4$ each have a 25% probability of being changed in the next 50 ms. More concretely, we can use a state transition matrix to represent the Markov transition diagram. The observation period lasts at least 8 s, so we obtain more than 160 groups of state transitions like these. We can then calculate the state transition matrix $P_{t \times t}$ based on these state transitions [19].
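The construction of a transition matrix from the 50 ms dirty-page samples can be sketched as below. The counting rule is an assumption: every page dirtied in one interval is linked with equal weight to every page dirtied in the next interval, and each row is then normalized. The text does not spell out the exact weighting behind Figure 2, so the probabilities produced here need not match the 50%/25% values quoted above.

```python
from collections import defaultdict


def transition_matrix(observations):
    """Estimate P[x][y], the probability that page y is dirtied in the
    interval following one in which page x was dirtied.

    observations is a list of sets, one set of dirtied pages per interval.
    Assumed counting rule: each page of interval k is linked with equal
    weight to each page of interval k + 1.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for current, following in zip(observations, observations[1:]):
        for x in current:
            for y in following:
                counts[x][y] += 1
    matrix = {}
    for x, row in counts.items():
        total = sum(row.values())
        matrix[x] = {y: c / total for y, c in row.items()}  # normalize each row
    return matrix


# The three 50 ms intervals from the example in the text.
observations = [{"w1", "w2"}, {"w2", "w3", "w4"}, {"w1", "w4", "w5"}]
P = transition_matrix(observations)
```

Each row of `P` sums to 1. Pages that appear only in the last interval, such as `w5`, have no outgoing row, because no following interval was observed for them.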

Switched Decision Factor(SDF)
Before we introduce the SDF, let us discuss the flaw of the current hybrid-copy algorithm. The key idea of the hybrid-copy algorithm is to combine the pre-copy algorithm and the post-copy algorithm to remedy the defects of both. In the current hybrid-copy algorithm, all memory pages are transmitted to the destination only once, and then the post-copy algorithm starts working. Because all memory pages are copied once, the number of page faults is notably reduced compared with the pure post-copy algorithm. However, if more pre-copy iterations are executed, the number of page faults and the downtime caused by page faults are reduced further. Thus, the best time to switch from the pre-copy algorithm to the post-copy algorithm is a key issue in the hybrid-copy algorithm.
Here, we propose the SDF to determine when to switch from the pre-copy phase to the post-copy phase. Some definitions and symbols are first introduced in Table 1 to facilitate the discussion.

Definition 2 (Invalid Transfer).
An invalid transfer occurs when a memory page at the source is dirtied again after it has been transferred to the destination at least once.

Table 1. Symbols and definitions.
Symbol      Definition
$N$         The total number of the VM's memory pages
$S(n)$      The number of memory pages transmitted in the $n$-th round
$R(n)$      The total number of memory pages transmitted after the $n$-th round
$T(n)$      The number of invalid transfers in the $n$-th round
$V_1(n)$    The total number of invalid transfers after the $n$-th transfer
$V_2(n)$    The number of dirty pages after the $n$-th transfer
If a transmitted memory page at the source is dirtied again, it must be retransmitted. This means that all transmissions of this memory page before the current copy round did nothing but waste resources and time. Therefore, the number of invalid transfers should be as small as possible.
At the beginning, no data has been transferred between the source and the destination, so the value of $n$ in Table 1 is 0. We therefore arrive at the following conclusion:
$$R(0) = 0, \quad V_1(0) = 0, \quad V_2(0) = N \qquad (1)$$
As shown in Table 1, $V_1(n)$ represents the total number of invalid transfers after the $n$-th transfer, while $V_2(n)$ represents the number of dirty pages after the $n$-th transfer. $V_1(n)$ reflects the bandwidth wasted in the pre-copy phase, and $V_2(n)$ is tied to the number of page faults and the downtime caused by page faults. Therefore, we introduce the following function to evaluate the state of the migration process:
$$F(n) = \alpha V_1(n) + (1 - \alpha) V_2(n) \qquad (2)$$
Here, $0 \le \alpha \le 1$. If $\alpha = 0$, only the number of dirty pages needs to be considered, and the pure pre-copy algorithm should be adopted. In this situation, we get the smallest downtime. However, a great deal of bandwidth may be wasted and the migration process may last a long time. Conversely, if $\alpha = 1$, then $(1 - \alpha) = 0$. This means that the number of invalid transfers should be as small as possible. In this case, the pure post-copy algorithm is a good choice: we can guarantee that all memory pages are transferred only once and that there are no invalid transfers. However, the numerous page faults cause network traffic and seriously affect the user experience.
According to the definitions of the symbols in Table 1, Equations (3) and (4) can be derived as
$$V_1(n) = V_1(n-1) + T(n) \qquad (3)$$
$$V_2(n) = V_2(n-1) - S(n) + T(n) \qquad (4)$$
Based on Formulas (1) and (2), we can easily get Formula (5) as follows:
$$F(0) = (1 - \alpha) N \qquad (5)$$
If $n > 0$, then we can get Formula (6) through Formulas (2)-(4):
$$F(n) = F(n-1) + T(n) - (1 - \alpha) S(n) \qquad (6)$$
We want the value of $F(n)$ to be as small as possible, to save bandwidth and reduce the number of page faults. Under ideal conditions, $V_2(n)$ tends to 0 and $V_1(n)$ does not increase, so $F(n)$ tends to 0 as $n$ increases. However, in write-intensive workloads, a large number of invalid transfers are produced during the VM migration process, so the value of $F(n)$ may actually increase rather than decrease. At this point, the iteration phase should be stopped and the pre-copy algorithm should be switched to the post-copy algorithm, because more iterative copies just waste network resources and time. If the $n$-th iterative copy round has a positive effect, $F(n)$ should be smaller than $F(n-1)$. Then, Formula (7) can be derived as follows:
$$\frac{V_2(n-1) - V_2(n)}{S(n)} > \alpha \qquad (7)$$
We define $(V_2(n-1) - V_2(n))/S(n)$ as the switched decision factor (SDF), and $SDF(n)$ denotes the value of the SDF in the $n$-th round. If $SDF(n)$ is smaller than $\alpha$, the iteration phase is stopped and the post-copy algorithm kicks in. According to Formula (2), $\alpha$ represents the tradeoff between the number of page faults and bandwidth consumption. As $\alpha$ increases, fewer iterative copies are executed.
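The switching rule can be written down directly from the definition of the SDF. The sketch below uses hypothetical function names, and the page counts in the usage note are illustrative values, not measurements.

```python
def sdf(v2_prev, v2_now, sent):
    """Switched decision factor: (V2(n-1) - V2(n)) / S(n)."""
    if sent <= 0:
        raise ValueError("S(n) must be positive")
    return (v2_prev - v2_now) / sent


def should_switch(v2_prev, v2_now, sent, alpha):
    """Return True when the pre-copy phase should hand over to post-copy."""
    return sdf(v2_prev, v2_now, sent) < alpha
```

Suppose 1000 pages are dirty before a round and 500 pages are sent. If only 100 pages are dirtied again, 600 dirty pages remain and the SDF is 0.8, so with α = 0.7 the iteration continues. If 300 pages are dirtied again, 800 remain, the SDF drops to 0.4, and the algorithm switches to the post-copy phase.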

Novel Hybrid-Copy Algorithm
With the Markov model and the SDF described above, the complete Novel Hybrid-Copy Algorithm can now be stated. Algorithm 1 shows the framework of the complete algorithm, and Algorithm 2 shows the function that determines the send list, using the Markov model to forecast the memory write order.
The input of Algorithm 1 consists of three variables. The variable "memory_size" is the total number of memory pages in the VM. The variable "dirty_threshold" is the threshold on the number of remaining dirty pages: if the number of dirty pages is smaller than this threshold, the iterative copy phase is stopped early because the dirty pages are running out. The last variable, "a", represents α. The function "Get_Dirty_List()" is a Xen system call that returns the set of dirty pages. The pre-copy phase runs in a condition-controlled loop, which stops when the value of the SDF is smaller than the preset α or the number of dirty pages is smaller than the dirty-page threshold. The function "Filter_Dirty()" is defined in Algorithm 2. The function "Send_live_dirty()" is a Xen system call that sends the memory pages stored in "send_list", and the function "Get_Dirty_Size()" is also a Xen system call, which returns the number of dirty pages. Finally, the pre-copy algorithm is stopped and the post-copy algorithm is started.
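The control flow of the pre-copy phase described above can be sketched with a toy model. Everything here is a stand-in: `ToyVM` replaces the Xen calls named in the text with in-memory methods, `redirty_per_round` is a hypothetical knob for write intensity, and `send_upper_half` replaces the Markov-based filter of Algorithm 2 with a fixed rule.

```python
class ToyVM:
    """Hypothetical in-memory stand-in for the Xen calls named in the text."""

    def __init__(self, memory_size, redirty_per_round):
        self.memory_size = memory_size
        self.redirty_per_round = redirty_per_round
        self.dirty = set(range(memory_size))  # all pages start dirty

    def get_dirty_list(self):
        return sorted(self.dirty)

    def send_live_dirty(self, send_list):
        self.dirty -= set(send_list)
        # The guest keeps running, so some clean pages are written again.
        clean = [p for p in range(self.memory_size) if p not in self.dirty]
        self.dirty.update(clean[: self.redirty_per_round])


def send_upper_half(dirty):
    """Placeholder filter: send the upper half of the sorted dirty list."""
    keep = (len(dirty) + 1) // 2
    return dirty[-keep:]


def run_pre_copy_phase(vm, dirty_threshold, a, filter_dirty=send_upper_half):
    """Iterate until SDF < a or too few dirty pages remain.

    Returns (rounds, remaining_dirty_pages) at the switch to post-copy.
    """
    rounds = 0
    dirty = vm.get_dirty_list()
    while dirty and len(dirty) >= dirty_threshold:
        send_list = filter_dirty(dirty)
        vm.send_live_dirty(send_list)
        rounds += 1
        remaining = vm.get_dirty_list()
        sdf = (len(dirty) - len(remaining)) / len(send_list)
        dirty = remaining
        if sdf < a:  # this round did not pay off: switch to post-copy
            break
    return rounds, len(dirty)
```

In this model, a VM of 1000 pages with no rewrites runs five rounds before the dirty set falls below a threshold of 50. If 400 pages are rewritten per round, the SDF of the first round is 0.2, so with a = 0.5 the loop stops after a single round.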

Experimental Results and Comparative Analyses
In this section, we present the performance of the Novel Hybrid-Copy Algorithm and analyze the results. The experimental platform consists of three physical hosts, each with four eight-core 2.6 GHz Intel Xeon E5-2550 CPUs, 32 GB of DDR3 RAM, and Intel 82576 gigabit network connections. All VMs used in our experiments are configured with two VCPUs and 1 GB of RAM. We use Xen 4.8.0, the latest stable version, as the VM monitor. The guest kernel is Linux 4.9.13 (stable version), and the host kernel is a modified version of Linux 4.9.13 for both the source and the destination. The storage is accessed via the iSCSI protocol from the third physical host, which is configured as the NAS. The workloads used in the experiments are as follows:
SPEC VIRT_SC 2013: A benchmark for evaluating the performance of datacenter servers used in virtualized server consolidation. SPEC VIRT_SC 2013 (v1.1) measures the end-to-end performance of all system components, including the hardware, the virtualization platform, and the virtualized guest operating system and application software. The benchmark supports hardware virtualization, operating system virtualization, and hardware partitioning schemes. It is a write-intensive workload.
Linux Kernel Compile: A Linux 4.9.13 kernel is compiled in the migrated VM in our second experiment. It is a read-intensive workload.

Total Migration Time
Figure 3 shows the total migration time for both the SPEC VIRT_SC 2013 and the Linux Kernel Compile workloads. The value of α is set to 0.3, 0.5, 0.7, and 1.0. We also compare with the original hybrid-copy algorithm: the last column in Figure 3 represents its results. We found that as α increased, the total migration time of the Novel Hybrid-Copy algorithm decreased, because fewer iterative copies are conducted as α increases.
When α is set to 1.0, the copy operation before the post-copy phase is executed only once. Unlike the original hybrid-copy algorithm, the novel algorithm does not transfer all memory pages in the iterative copy phase: only the half of the memory pages selected by the Markov model is transferred in advance. Fewer memory pages are therefore transferred in total, and the Novel Hybrid-Copy algorithm should need less migration time when α is set to 1.0. The results in Figure 3b are broadly in line with this expectation. However, Figure 3a shows a result that is inconsistent with it, because the migration time is not determined only by the number of transferred memory pages. It is also related to the real-time network conditions, the VM state, and the number of page faults that occur during the post-copy phase. In fact, because all memory pages are transferred at most twice in the original hybrid-copy algorithm, there is little room to reduce the total migration time.

Total Transferred Pages
As shown in Figure 4, the number of transferred pages depends on the value of α. In the original hybrid-copy algorithm, all memory pages are transferred only once in the pre-copy phase.
We found that a total of 297,564 pages are transferred in the SPEC VIRT_SC 2013 workload with the original hybrid-copy algorithm. On our experimental platform, every VM has 262,144 memory pages. This means that there are 35,420 invalid transfers in the migration process.
Figure 4a,b shows that if α is large enough, the novel hybrid-copy algorithm transfers fewer pages than the original hybrid-copy algorithm, because the Markov model avoids part of the invalid transfers. In our experiments, if α is set to 0.7, the novel hybrid-copy algorithm transfers a similar number of memory pages to the original hybrid-copy algorithm. If α is set to 1.0, it always transfers fewer memory pages in total. If α is small, it may transfer many more pages than the original hybrid-copy algorithm because more iterative copies are performed in the pre-copy phase. The number of transferred pages is positively correlated with the consumption of network resources. Therefore, if the network condition is poor, α should be set to a large value to save bandwidth.

Number of Page Faults
Figure 5a illustrates the number of page faults in the SPEC VIRT_SC 2013 workload for different values of α. As shown in Figures 3a and 5a, when α is set to 0.7, the number of page faults is reduced by 1515 at a cost of only 0.09 s of extra migration time compared with the original hybrid-copy algorithm. Fewer page faults can greatly improve the user experience.
As shown in Figure 5b, when the VM runs a Linux Kernel Compile workload, it is best to set α to 0.3. The fewest page faults occur in the post-copy phase with this setting because Linux Kernel Compile is a read-intensive workload compared with the SPEC VIRT_SC 2013 workload. In the SPEC VIRT_SC 2013 workload, even when α is set to 0.3, there is no significant improvement, while the total migration time increases, which adds to the burden on the network. Thus, it is appropriate to set α to 0.7 in the SPEC VIRT_SC 2013 workload.

Summary of Experiments
The experiments described above show that the novel hybrid-copy algorithm can effectively reduce the number of page faults, especially when the migrated VM runs a read-intensive workload. Based on the analysis of the experimental results, we conclude that as α increases, fewer iterative copies are executed. In this situation, more bandwidth can be saved, but the number of page faults and the downtime they cause may increase. Therefore, if a VM runs a write-intensive workload, α should be set to a large value. For example, it is appropriate to set α to 0.7 in the SPEC VIRT_SC 2013 workload. In this case, we spent an additional 0.4% of live migration time to reduce the number of page faults by about 10%. Conversely, a small value of α is a good choice for a read-intensive workload. More iterative copies can significantly reduce the page faults, and only a little more migration time is spent. For instance, in a Linux Kernel Compile workload, when α is set to 0.3, the number of page faults is reduced by almost 75% at a cost of only an additional 9.5% of migration time. Page faults seriously affect the performance of the migrating VM. Sometimes, the services provided by the migrating VM are totally unavailable because of the page faults. Therefore, reducing the number of page faults will significantly improve the user experience.
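The guidance above can be condensed into a rule of thumb for choosing a starting value of α. The Python sketch below is only an illustration: the two constants are the values reported for the two measured workloads, while the function name, the boolean inputs, and the way the poor-network case is folded in are hypothetical and would need tuning for any other workload.

```python
# Values reported for the two workloads measured in the experiments.
ALPHA_WRITE_INTENSIVE = 0.7  # e.g. SPEC VIRT_SC 2013
ALPHA_READ_INTENSIVE = 0.3   # e.g. Linux Kernel Compile


def suggest_alpha(read_intensive: bool, poor_network: bool = False) -> float:
    """Return a rule-of-thumb starting value for alpha (hypothetical helper).

    A larger alpha means fewer iterative copies: less bandwidth is used,
    but more page faults may occur in the post-copy phase. A smaller alpha
    means more iterative copies and fewer page faults.
    """
    if poor_network or not read_intensive:
        return ALPHA_WRITE_INTENSIVE
    return ALPHA_READ_INTENSIVE


print(suggest_alpha(read_intensive=True))                     # 0.3
print(suggest_alpha(read_intensive=False))                    # 0.7
print(suggest_alpha(read_intensive=True, poor_network=True))  # 0.7
```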

Related Work
Since Clark et al. [3] proposed the live migration of virtual machines, there has been considerable work on VM live migration. Nelson et al. [9] implemented their system on a VMware platform. It was the first system to support the migration of unmodified applications on an unmodified mainstream Intel x86-based operating system. Memory compression technology [2,20] has been widely used to improve migration performance. In these systems, the compression ratio is very important. A small compression ratio has little effect, while a large compression ratio may consume too many computing resources.
Hsu et al. [10] presented an adaptive pre-copy algorithm called MPP, which only transmits memory pages when a predefined threshold is met. It significantly reduces the unnecessary migration of memory pages. Baruchi et al. [11] introduced a method that identifies the workload cycles of a VM and uses this information to reduce live migration overhead. Ruan et al. [12] proposed a novel data filter mechanism to improve the performance of the pre-copy algorithm in terms of the migration time and the amount of migrated data.
Hines et al. [13] applied adaptive pre-paging and dynamic self-ballooning to the post-copy algorithm. Adaptive pre-paging forecasts the memory access pattern from the positions of the page faults and the principle of spatial locality. The source then adjusts the order in which memory pages are transferred. Abe et al. presented a new mechanism for the post-copy algorithm that focuses on recovering the aggregate performance of the affected VMs. Su et al. [15] improved the traditional post-copy algorithm by eliminating unnecessary remote page faults.
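The spatial-locality idea behind pre-paging can be illustrated with a small Python sketch that orders pages by their distance from the faulting page. This is a simplified illustration and not the implementation of Hines et al.; the function name and parameters are hypothetical, and the sketch ignores pages that have already been transferred.

```python
from typing import Iterator


def prepaging_order(fault_page: int, num_pages: int) -> Iterator[int]:
    """Yield page numbers starting at the faulting page and expanding outward.

    Pages closest to the fault are sent first, on the assumption that the
    VM is likely to touch neighbouring pages next (spatial locality).
    """
    yield fault_page
    for distance in range(1, num_pages):
        right = fault_page + distance
        left = fault_page - distance
        if right < num_pages:
            yield right
        if left >= 0:
            yield left


print(list(prepaging_order(fault_page=3, num_pages=6)))  # [3, 4, 2, 5, 1, 0]
```

Each page is yielded exactly once, so the order can serve as a transfer queue that is recomputed whenever a new page fault arrives.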
Based on the pre-copy algorithm and the post-copy algorithm, Chen et al. [17] designed a hybrid-copy algorithm. All memory pages in the source are copied to the destination first, and then the post-copy method kicks in. Many other hybrid-copy algorithms [16,18] have been proposed. However, there has been little research on when to switch from the pre-copy phase to the post-copy phase.
Recently, Arif et al. [5] studied live migration over wide area networks. They proposed an MLDO approach to reduce downtime during live migration over wide area networks. Esposito and Cerroni [21] proposed a geometric programming model and an online multi-VM live migration algorithm based on that model. Sun et al. [22] also studied live migration for multiple VMs and the parallel migration strategy.
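To make the two phases concrete, the following toy Python model counts the remote page faults of a hybrid-copy migration. It is a sketch under simplifying assumptions and not the implementation of Chen et al.: the function name and inputs are hypothetical, every page is assumed to be sent once in the pre-copy phase, and a page becomes stale only if the source dirties it after it was sent.

```python
from typing import Iterable


def hybrid_copy_faults(dirtied_after_copy: Iterable[int],
                       accessed_at_destination: Iterable[int]) -> int:
    """Count remote page faults in a toy hybrid-copy migration.

    Pre-copy phase: every page has been sent once, so only the pages that
    were dirtied afterwards are stale at the destination.
    Post-copy phase: the first access to a stale page triggers a remote
    page fault, after which that page is up to date.
    """
    stale = set(dirtied_after_copy)
    faults = 0
    for page in accessed_at_destination:
        if page in stale:
            faults += 1
            stale.discard(page)  # page fetched from the source, now current
    return faults


# Pages 2, 5 and 7 were dirtied after the single pre-copy round.
print(hybrid_copy_faults({2, 5, 7}, [1, 2, 2, 5, 9]))  # 2
```

In this model, any additional copy round that shrinks the stale set before the switch directly lowers the fault count, which is why the timing of the switch matters.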

Conclusions and Future Works
Although much research has been done in the field of VM live migration, little of it addresses when to switch from the pre-copy phase to the post-copy phase in the hybrid-copy algorithm based on the actual state of the VM. In this paper, a novel hybrid-copy algorithm is proposed. We introduced SDF to decide the best time to switch from the pre-copy phase to the post-copy phase. A Markov model is used to forecast the memory access pattern and thereby reduce the number of invalid transfers. The experiments demonstrate that our mechanism performs well on both write-intensive and read-intensive workloads. Compared with the original hybrid-copy algorithm, it effectively reduces the page faults while achieving the same level of total migration time. Page faults cause a serious degradation of VM performance. Therefore, our algorithm improves the user experience by reducing the number of page faults.
In the future, we plan to design a new algorithm that calculates the SDF value automatically so that it is adjusted according to the real-time state of the VM. In addition, we will look for a better model than the Markov model to further reduce invalid transfers and save more migration time.

Figure 2. The Markov model representing the previous dirty pages via transition probabilities.

Figure 3. The total migration time of the experiment. (a) SPEC VIRT_SC 2013; (b) Linux Kernel Compile.


Table 1. Symbols and definitions.