In recent years, with the rapid worldwide spread and development of cloud computing technology, a large number of computing operations in datacenters must respond rapidly and efficiently to ensure service capability. However, the growing demand for cloud infrastructure has led to a dramatic increase in the power consumption of datacenters, which has become a significant issue that needs to be solved. Datacenters around the world consume a large amount of energy each year, and the average power consumed by each datacenter is almost equal to the power consumed by 25,000 homes in the United States [1]. In 2017, there were approximately 8 million datacenters around the world, which consumed 416.2 terawatt hours of electricity [2]. This is equivalent to 2% of the total electricity consumption in the world and is expected to reach 5% of global electricity consumption by 2020. The energy cost of a datacenter is estimated to account for nearly 50% of its total operating cost. Moreover, much of the electrical energy consumed is not used efficiently. One of the major reasons is that a datacenter draws a certain proportion of idle power during operation: even at very low utilization rates, such as 10% CPU usage, the power consumed exceeds 50% of the peak power [3]. Some methods that dynamically migrate or consolidate tasks on less-utilized servers, or turn off idle servers, have been proven to be energy-efficient strategies [4]. However, most of these studies considered either the computing power consumption or the cooling power consumption of the datacenter separately, without combining them.
In recent years, the use of green energy sources such as solar energy, wind energy, and tidal energy has become a global trend in building green, sustainable datacenters [7]. Kong et al. [10] conducted a survey on renewable energy use and carbon emissions in many datacenters. Compared with brown energy, green energy has many advantages, such as being natural, renewable, clean, and low cost. However, the generation of renewable energy such as wind, solar, and tidal energy is usually intermittent and unstable. Hence, accurately predicting the amount of available renewable energy is worth studying in order to make full use of renewable sources in datacenters. In recent years, well-known IT companies such as Microsoft and IBM have been operating large-scale datacenters around the world to cope with the growing computation demand, and they have also been trying to use renewable energy as a partial supply for their datacenters to further reduce energy costs. Therefore, effectively managing such an energy supply in the datacenter has become an important issue for these service providers.
Nowadays, most datacenters support various types of workloads, including critical interactive workloads and batch-type workloads, the latter of which can be deferred for a certain time before being processed. In general, interactive workloads include web browsing, real-time gaming, data queries, and other workloads that need an immediate response. In contrast, batch-type workloads such as image processing, scientific applications, and financial data analysis can be scheduled later as long as they are completed before their deadlines [11]. This makes it feasible to schedule workloads in the datacenter along the time dimension.
The main objective of this paper is to manage workloads effectively in a green datacenter, aiming to make full use of renewable energy and minimize the total power cost of the datacenter. In this paper, we adopt solar energy as the renewable energy supply in the experiments. Because the energy input fluctuates and the workload changes dynamically over time, we adjust the number of workloads in each time slot and the temperature supplied by the cooling device to maximize the use of renewable energy. In this way, the brown power consumption of both IT devices and cooling devices can be decreased. Moreover, we adopt a neural network model to predict the amount of solar power generated, which facilitates more accurate workload allocation decisions. This paper is an extended version of our prior work [12]. The biggest difference from the previous version is that we consider two types of workloads at the same time (interactive workloads and batch-type workloads). In addition, because solar power generation is unstable, we predict the amount of solar energy generated in advance to better schedule the batch-type jobs, include more methods for comparison, and provide a more detailed analysis of the results. The length of the text has also increased by 50%.
The remainder of this paper is organized as follows. Section 2 introduces related work on datacenter energy management. Section 3 presents the problem definition and the model used in this paper. Section 4 depicts the architecture of the green datacenter and the solar power prediction method. Section 5 describes the methods and strategies we designed to solve the defined problem. Section 6 analyzes the experimental results by comparing the proposed strategy with three other strategies. Finally, Section 7 concludes the paper and discusses possible future work.
3. The Architecture of the Green Datacenter
In this section, we present the architecture of the green datacenter using mixed energy supplies and the prediction model used to forecast the energy generation amount.
3.1. Datacenter Architecture
Assume the datacenter system consists of N hosts, denoted as host 1 to host N. These hosts complete distributed tasks individually or collaboratively. The resources of a host generally include CPU, storage, and bandwidth (bw). We use cmax to represent the maximum CPU capacity of a host; the maximum storage and the maximum bw a host can offer are denoted as smax and bmax, respectively. Together, these three maxima describe the total resources of a host. The total simulation time is one day, divided into τ = 24 time slots, so the length of each time slot is 60 min. In order to schedule the tasks and adjust the temperature provided in each time slot, we assume that the supply air temperature of the cooling device can be set dynamically on demand.
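As an illustration of the host and time-slot model above, the following minimal Python sketch encodes the three per-host maxima and the division of one day into τ = 24 slots. The class name `HostCapacity`, the helper `slot_of`, and the units in the comments are assumptions made for this example only and are not part of the simulation setup.

```python
from dataclasses import dataclass

SLOTS_PER_DAY = 24                        # tau in the text
SLOT_MINUTES = 24 * 60 // SLOTS_PER_DAY   # 60 minutes per slot


@dataclass(frozen=True)
class HostCapacity:
    """Maximum resources one host can offer (units are illustrative assumptions)."""
    cmax: float  # maximum CPU capacity, e.g. MIPS
    smax: float  # maximum storage, e.g. GB
    bmax: float  # maximum bandwidth, e.g. Mbit/s

    def fits(self, cpu: float, storage: float, bw: float) -> bool:
        """Return True if a request stays within all three maxima."""
        return cpu <= self.cmax and storage <= self.smax and bw <= self.bmax


def slot_of(minute_of_day: int) -> int:
    """Map a minute of the day (0..1439) to its time-slot index (0..23)."""
    if not 0 <= minute_of_day < 24 * 60:
        raise ValueError("minute_of_day must be in [0, 1440)")
    return minute_of_day // SLOT_MINUTES
```

For example, `slot_of(125)` returns slot 2, the third hour of the simulated day.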
Figure 1 depicts the architecture of the green datacenter powered by both renewable energy and traditional energy from the utility grid. The utility grid and the renewable energy source are combined through the automatic transfer switch to provide power for the datacenter. The IT devices include servers, storage, and networking switches that support the applications and services hosted in the datacenter. The cooling devices deliver cooling resources to dissipate the heat generated by the IT equipment. In this paper, the cooling capacity is delivered to the datacenter through the computer room air conditioning (CRAC) units from the cooling micro-grid, which consists of the traditional chiller plant. The architecture does not include energy storage equipment, because such equipment has the following shortcomings [32]: (a) the internal resistance and self-discharge of the battery result in a loss of energy; (b) battery-related costs predominate in solar-powered systems; and (c) the chemicals in the battery can cause harm to the environment to some extent.
3.2. Renewable Energy Forecasting
Because renewable energy forms part of the energy supply in the sustainable datacenter, and because solar power generation is unknown in advance and unstable, we predict the amount of solar energy generated ahead of time to better schedule the batch-type jobs. In this way, the impact of the unstable solar energy supply on job scheduling can be avoided to some extent. Many researchers have used a variety of methods to predict the amount of power generation [11]. In this paper, we use solar energy as the renewable energy supply. The LSTM (long short-term memory) neural network model is adopted for solar prediction; it is a recurrent neural network that is trained using backpropagation through time and overcomes the vanishing gradient problem. Because the LSTM model adds a “processor” to determine whether information is useful or not, it has a better memory for historical data than some other neural network models, and it is suitable for processing and predicting important events with relatively long intervals and delays in a time series. We select m-day historical solar power generation data as a training dataset to predict the solar power generation on day m.
To obtain a precise prediction and thus a more accurate allocation of jobs, we derived the data from a public data sharing website [42]. We selected solar power data for sunny days and for cloudy (rainy) days in February, March, and April 2018 as training sets to forecast the solar power generation of 14 July 2018 and of 8 May 2018, respectively, where 14 July 2018 is a sunny day and 8 May 2018 is a cloudy day. We used the scikit-learn machine learning library for model training and data normalization. Figure 2a shows the forecast result for a sunny day, and Figure 2b shows the forecast for a cloudy (rainy) day; the error rate between the predicted and actual values remains within 7% on the sunny day and within 20% on the cloudy (rainy) day.
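The data preparation and the error measure can be sketched as follows. This is a minimal, library-free illustration under stated assumptions: the function names are hypothetical, the error rate is assumed to be the total absolute deviation divided by the total actual generation (the exact definition is not given above), and the LSTM model itself is omitted. In practice, a scaler such as scikit-learn's `MinMaxScaler` would play the role of `min_max_scale`.

```python
def min_max_scale(values):
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def make_windows(series, window):
    """Build (input window, next value) training pairs from a time series."""
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]


def error_rate(predicted, actual):
    """Assumed error measure: sum of |predicted - actual| over sum of actual."""
    total_actual = sum(actual)
    if total_actual == 0:
        raise ValueError("actual generation sums to zero")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / total_actual
```

With hourly readings, a window of 24 would pair each day of history with the following hour as the prediction target; the window length is a tuning choice and is not specified here.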
5. Methods and Strategies
To address the issue defined in Section 4, we propose a thermal-aware approach for workload and power management of the datacenter and implement three other methods for comparison. We consider the characteristics of the different jobs in the datacenter, mainly taking into account two categories: interactive workloads and batch-type workloads. As previously described, interactive workloads must be answered immediately, while batch workloads are delay-tolerant.
5.1. Static Method (ST)
Under this method, batch-type jobs are processed as soon as possible after they arrive, without any other scheduling actions.
5.2. Load Balancing Distribution over Time (LB)
Under this strategy, the batch-type jobs are scheduled and distributed evenly over multiple time slots, while the interactive tasks will be processed immediately after they arrive.
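A minimal sketch of such an even distribution is given below. It assumes that the jobs arriving in a slot are spread over a fixed number of consecutive slots, truncated at the end of the day; the function name `spread_evenly` and the `window` parameter are hypothetical and only illustrate the idea.

```python
def spread_evenly(arrivals, window):
    """Spread the batch jobs arriving in each slot evenly over up to
    `window` consecutive slots, starting at the arrival slot.

    arrivals[t] is the number of batch-type jobs arriving in slot t.
    Returns the number of batch-type jobs executed in each slot.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    slots = len(arrivals)
    plan = [0] * slots
    for t, n in enumerate(arrivals):
        width = min(t + window, slots) - t   # slots available before day end
        base, extra = divmod(n, width)
        for k in range(width):
            # The first `extra` slots take one additional job each.
            plan[t + k] += base + (1 if k < extra else 0)
    return plan
```

Interactive jobs are not part of this plan because they are served in the slot in which they arrive.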
5.3. Best Effort Strategy (BS)
Under this strategy, a scheduling plan is made for the submitted batch-type jobs based on the predicted solar power generation. The number of active hosts that the solar supply can sustain in a time slot is calculated using Equation (9). We then check whether these hosts can handle the current number of interactive jobs. If they cannot, brown energy is used to power on extra hosts according to the job demand.
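The host-count check can be illustrated with the sketch below. It does not reproduce Equation (9); instead it assumes, purely for illustration, that every active host draws a fixed power `host_watts` and serves `jobs_per_host` jobs per time slot. Both parameters and both function names are hypothetical.

```python
import math


def solar_hosts(solar_watts, host_watts, total_hosts):
    """Hosts that the solar supply alone can keep active in one slot
    (simplified stand-in for Equation (9): fixed power per host)."""
    return min(total_hosts, int(solar_watts // host_watts))


def brown_hosts_needed(interactive_jobs, jobs_per_host, solar_powered, total_hosts):
    """Extra hosts that must be powered from the grid so that all
    interactive jobs of the slot can be served."""
    required = min(total_hosts, math.ceil(interactive_jobs / jobs_per_host))
    return max(0, required - solar_powered)
```

For instance, with 1000 W of solar power and 250 W per host, four hosts run on solar energy; serving 50 interactive jobs at 10 jobs per host then requires one additional grid-powered host.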
5.4. Thermal-Aware Workload Management (TM)
Under this strategy, we take both the workload and the temperature adjustment of the datacenter into account simultaneously. The critical component of this method is the time-shifting of batch-type jobs to consume more solar power. On the basis of the BS method, we further perform pre-cooling if there is surplus solar energy in the t-th time slot and no jobs need to be processed, so that more jobs can be handled later when solar generation becomes insufficient; if there are still unprocessed jobs, the surplus solar energy is used to run those jobs first. Specifically, the temperature is adjusted dynamically according to the amount of surplus solar energy, denoted as Et, t = 1, 2, …, τ. We can use Equations (4)–(6) and (15) to calculate the temperature that the cooling devices should supply given the extra energy from solar generation. The main consideration of this method is to run as many jobs as possible when solar power is sufficient and to decrease the power consumption of the cooling devices. Unlike the first three strategies, this strategy considers the power consumption of both IT devices and cooling devices. The pseudocode of the algorithm is shown in Algorithm 1.
|Algorithm 1: The process of TM method|
|Input: the number of jobs, solar power generation, Tmax|
|Output: job schedule plan|
|1. St ← getSolarPrediction()|
|2. Nt ← the number of hosts that can be powered by the provided solar power at time slot t|
|3. if the number of hosts Nt is enough to process jobs then|
|4. if Et > 0 then|
|5. perform the pre-cooling action (decrease the temperature of the cooling device)|
|6. end if|
|7. else|
|8. power on some hosts using brown energy according to the workload demand|
|9. if Tin ≤ Tmax and performed pre-cooling then|
|10. allocate more batch-type jobs|
|11. update the remaining number of batch-type jobs|
|12. end if|
|13. end if|
|14. return: job schedule plan|
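The control flow of Algorithm 1 can be sketched in Python as follows. This is a simplified illustration under several assumptions and not the implementation used in the experiments: each host draws a fixed power `host_watts` and serves `jobs_per_host` jobs per slot, the thermal model of Equations (4)–(6) and (15) is replaced by a given inlet temperature series `t_in`, deadlines are ignored, and the hypothetical parameter `precool_bonus_jobs` stands for the number of extra batch-type jobs that earlier pre-cooling makes affordable.

```python
import math
from dataclasses import dataclass


@dataclass
class SlotPlan:
    batch_run: int     # batch-type jobs executed in this slot
    precool: bool      # whether the pre-cooling action is taken in this slot
    brown_hosts: int   # hosts that must be powered from the utility grid


def tm_schedule(solar, interactive, batch_arrivals, t_in, t_max,
                host_watts, jobs_per_host, total_hosts, precool_bonus_jobs):
    """Return (per-slot plans, unprocessed batch jobs) for one day.

    `solar` is the predicted solar power per slot (step 1 of Algorithm 1).
    """
    plans, backlog, precooled = [], 0, False
    for t in range(len(solar)):
        backlog += batch_arrivals[t]
        n_solar = min(total_hosts, int(solar[t] // host_watts))    # step 2
        needed = math.ceil(interactive[t] / jobs_per_host)
        if n_solar >= needed:                                      # step 3
            # Pending batch jobs use spare solar-powered capacity first.
            spare_jobs = n_solar * jobs_per_host - interactive[t]
            batch_run = min(backlog, spare_jobs)
            backlog -= batch_run
            used = math.ceil((interactive[t] + batch_run) / jobs_per_host)
            surplus = solar[t] - used * host_watts                 # E_t
            precool = surplus > 0 and backlog == 0                 # steps 4-5
            precooled = precooled or precool
            plans.append(SlotPlan(batch_run, precool, 0))
        else:
            # Solar power is insufficient: add grid-powered hosts (step 8).
            brown = min(total_hosts, needed) - n_solar
            batch_run = 0
            if precooled and t_in[t] <= t_max:                     # step 9
                batch_run = min(backlog, precool_bonus_jobs)       # step 10
                backlog -= batch_run                               # step 11
                precooled = False
            plans.append(SlotPlan(batch_run, False, brown))
    return plans, backlog
```

In this sketch a pre-cooling credit is consumed by the first solar-deficient slot that uses it; a fuller model would instead track the room temperature over time and derive the admissible extra load from it.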
6. Evaluation Results
Here, we set up a series of numerical simulations and experiments to evaluate the four methods described in Section 5. We simulated a datacenter using the CloudSim-plus tool, a cloud computing simulation tool that extends version 3.0 of the CloudSim tool and uses some of the features provided by JDK 1.8. CloudSim is developed in the Cloud Computing and Distributed Systems (CLOUDS) Laboratory at the Computer Science and Software Engineering Department of the University of Melbourne. We simulated the four methods under a random load. We set Tmax = 30 °C, which specifies that the temperature inside the datacenter should be less than 30 °C [29]. The solar power data are derived from a public solar data sharing website [42].
Due to the particularity of the load model defined in this paper, we did not use actual load traces, such as Google traces. Instead, to facilitate the simulation without loss of generality, we randomly generated jobs that arrive over time and assumed that some batch-type jobs are submitted around midnight and around noon [11]. Following the method proposed in Reference [11], we set the maximum time by which a batch-type job can be deferred to 12 h, that is, dtmax = 12. Figure 3 shows the number of interactive and batch-type jobs arriving in each time slot. The number of servers in the datacenter and the power consumption parameters are shown in Table 1.
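A random load of this kind could be generated as in the sketch below. The job-count ranges, the width of the submission peaks, and the function names are assumptions chosen for illustration; they are not the values used to produce Figure 3.

```python
import random

SLOTS = 24
DT_MAX = 12  # maximum deferral of a batch-type job, in time slots (hours)


def generate_workload(seed, interactive_range=(20, 60), batch_peak=40,
                      batch_peak_slots=(0, 12)):
    """Return (interactive, batch) job arrivals per time slot.

    Interactive jobs arrive in every slot; batch-type jobs are submitted
    only in the slots around midnight (slot 0) and noon (slot 12).
    """
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    interactive = [rng.randint(*interactive_range) for _ in range(SLOTS)]
    batch = [0] * SLOTS
    for peak in batch_peak_slots:
        for offset in (-1, 0, 1):
            slot = (peak + offset) % SLOTS
            batch[slot] += rng.randint(batch_peak // 2, batch_peak)
    return interactive, batch


def deadline_slot(arrival_slot):
    """Latest slot in which a batch-type job may start, counted from the
    start of the simulated day (values above 23 fall on the next day)."""
    return arrival_slot + DT_MAX
```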
6.1. Power Consumption
In this subsection, the results illustrate the solar power utilization in detail under the four strategies described previously. In Figure 4a, the blue columns represent the solar power usage under the different strategies on a sunny day. The solar utilization of the TM method is the highest, reaching 98%, while that of ST is the lowest. For clarity, the detailed utilization values are listed in Table 2. The solar utilization on a cloudy (rainy) day is shown in Figure 4b, with the corresponding statistics in Table 3; solar power utilization under the TM strategy is also the highest in cloudy conditions.
Figure 5 depicts the detailed power consumption under the four strategies, including the computing power consumption (batch-type jobs and interactive jobs) and the cooling power consumption. Figure 5a–d illustrate the power consumption under ST, LB, BS, and TM, respectively. ST and LB do not make full use of solar energy and rely more on brown energy for power supply. The power consumption of BS varies with the supplied solar power, but some solar power is still wasted. Compared with the other three methods, TM makes better use of solar energy by jointly scheduling workloads and adjusting the cooling temperature. Hence, TM uses brown energy only when necessary.
We also analyzed the detailed power consumption on a cloudy (rainy) day, as shown in Figure 6. In Figure 6d, some batch-type jobs need to be processed in addition to the interactive jobs whenever solar power is available, so there is no surplus solar energy for pre-cooling. This means that the proposed strategy consumes more brown energy on a cloudy (rainy) day, although it still maximizes the use of solar energy. The detailed power consumption of the other three methods is similar to that in sunny weather.
Here, the results illustrate in detail the power consumption and solar energy utilization of the four strategies. In Figure 7, the green portion of each column represents the solar power used under the given strategy, and the orange portion represents the brown energy used. The total power consumption of TM was the highest, since it consumed almost all of the generated solar energy, while its consumption of brown energy was the lowest. In contrast, the other three methods consumed more brown energy. For a more detailed comparison, the solar energy actually used under each strategy is listed in Table 4 and Table 5 for a sunny day and a cloudy (rainy) day, respectively. The TM strategy uses the most solar energy in both weather conditions.
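The split between solar and brown energy can be computed per time slot as in the following sketch, which assumes that solar power not consumed within a slot is lost (the architecture has no energy storage) and that utilization is the solar energy used divided by the solar energy generated. The function name `energy_split` is hypothetical.

```python
def energy_split(demand, solar):
    """Return (solar energy used, brown energy used, solar utilization).

    demand[t] and solar[t] are the power demand and the solar generation
    in slot t, in the same unit. Unused solar power cannot be carried
    over to a later slot.
    """
    solar_used = sum(min(d, s) for d, s in zip(demand, solar))
    brown_used = sum(max(0, d - s) for d, s in zip(demand, solar))
    generated = sum(solar)
    utilization = solar_used / generated if generated else 0.0
    return solar_used, brown_used, utilization
```

With solar generation `[0, 10, 10, 0]`, a flat demand of `[5, 5, 5, 5]` uses only half of the solar energy and needs 10 units from the grid, whereas the same total demand shifted to `[0, 10, 10, 0]` uses all of it and needs none, which is the effect that time-shifting of batch-type jobs aims for.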
6.2. Job Scheduling Details
Table 6 lists the average waiting time of batch-type jobs under the four methods. The waiting time under ST is the lowest because this method responds to arriving batch-type jobs as quickly as possible. Under the other three methods, batch-type jobs have different waiting times. The average waiting time under LB is the longest, since batch-type jobs submitted in the morning are deferred evenly across all subsequent time slots. Compared with the BS method, the waiting time of the TM method is relatively short because more jobs can be performed when the solar energy is insufficient in the afternoon, while the BS method postpones more batch-type jobs to the night, which results in a long waiting time. The average waiting time for batch-type jobs in cloudy (rainy) weather is given in Table 7. The waiting time under each method is longer than in sunny weather. This is because solar energy is not sufficient in cloudy weather, and the TM method does not work well. However, the waiting time of the TM method is still shorter than that of the other strategies except ST. Therefore, we conclude that the TM method is suitable for various Service Level Agreement (SLA)-constrained environments, although its energy-saving effect may be limited under strong SLA constraints.
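One way to compute such waiting-time statistics is sketched below. It assumes that waiting time is measured in time slots from arrival to the start of execution and that an SLA violation means a wait longer than the allowed deferral; the function names and the tuple layout are hypothetical.

```python
def average_waiting_time(jobs):
    """Mean waiting time in slots.

    jobs is a list of (arrival_slot, start_slot, count) tuples, where
    `count` jobs arrived together and started together.
    """
    total_jobs = sum(count for _, _, count in jobs)
    if total_jobs == 0:
        return 0.0
    total_wait = sum((start - arrival) * count for arrival, start, count in jobs)
    return total_wait / total_jobs


def deferral_violations(jobs, dt_max):
    """Number of jobs whose waiting time exceeds the allowed deferral."""
    return sum(count for arrival, start, count in jobs if start - arrival > dt_max)
```

A tighter SLA corresponds to a smaller `dt_max`, which leaves less room to shift batch-type jobs toward periods of high solar generation.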
Figure 8 shows the job scheduling under the four methods. More jobs can be executed under TM when the solar power supply is sufficient. TM also scheduled more jobs than BS in several time slots after 12:00. This is because TM conducted pre-cooling, which made the room cooler while solar power was in surplus and thus allowed more jobs to be scheduled later, when solar generation dropped. Furthermore, since part of the batch-type jobs were processed between 13:00 and 16:00, TM processed fewer jobs than BS in several time slots after 20:00, which further reduced the consumption of utility grid power.
6.3. Temperature Provided by the Cooling Device
Figure 9 shows the temperature supplied by the cooling equipment under the BS and TM strategies. As shown in Figure 9, the temperature under TM was lower than under BS in several time slots before 12:00: the surplus solar energy was not fully utilized and no jobs needed to be processed at these moments, so TM used the extra solar energy for cooling. By taking the extra solar power into account and carrying out pre-cooling to cover the power demand of a later period, TM makes greater use of the solar energy.
7. Conclusions and Future Work
In this paper, we studied renewable-aware and thermal-aware workload management approaches that make full use of the green energy provided to the datacenter and, in turn, minimize the total energy cost. Due to the uncertain nature of renewable energy generation, we used a neural network model to predict solar energy. Considering the multiple characteristics of the workloads and the high energy cost of the cooling device, the TM strategy proposed in this paper performs well at scheduling hybrid types of workloads and adjusting the temperature dynamically according to the surplus solar energy supplied in each time slot. The experimental analysis shows that the TM strategy works less well when the solar energy supply is insufficient in cloudy (rainy) weather. Nevertheless, the experimental results illustrate that TM better achieves the energy-saving goal and minimizes the overall power cost of the datacenter in both sunny and cloudy weather.
The method proposed in this paper mainly takes into account two types of hybrid workload including interactive workloads and batch-type workloads. By delaying the execution of some more delay-tolerant jobs until the solar energy is sufficient and using excess solar energy for pre-cooling to cope with the cooling needs for the next period of time, the possible waste of the extra generated solar power can be avoided, and more jobs can be scheduled when solar energy is supplied.
Currently, our proposed workload scheduling method considers only the amount of renewable energy generated, helping the datacenter maximize the use of solar energy by scheduling batch-type workloads and adjusting the supply temperature of the cooling equipment. In the future, we plan to incorporate more demand response signals from the grid side to enable the datacenter to participate in demand response programs by adjusting its load and power consumption.