Long, continuous execution of software can cause aging-induced errors and failures. Cloud data centers suffer from such failures when Virtual Machine Monitors (VMMs), which control the execution of Virtual Machines (VMs), age. Software rejuvenation is a proactive fault management technique that can prevent future failures by terminating VMMs, cleaning up their internal states, and restarting them. However, the timing and type of VMM rejuvenation affect the performance, availability, and power consumption of a system. In this paper, an analytical model based on Stochastic Activity Networks is proposed for the performance evaluation of Infrastructure-as-a-Service cloud systems. Using the proposed model, a two-threshold power-aware software rejuvenation scheme is presented. Many details of real cloud systems, such as VM multiplexing, migration of VMs between VMMs, VM heterogeneity, failure of VMMs, failure of VM migration, and different arrival probabilities for different VM request types, are investigated using the proposed model. The proposed rejuvenation scheme is compared with two baselines on diverse performance, availability, and power consumption measures defined on the system.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.