In this section, we first establish a policy-based continuous-time Markov process with a finite block structure for the energy-efficient data center. We then define a suitable reward function with respect to both the states and the policies of the Markov process. This will be useful for setting up an MDP to find the optimal asynchronous dynamic policy in the energy-efficient data center.
3.1. A Policy-Based Block-Structured Continuous-Time Markov Process
The data center in
Figure 1 shows Group 1 of
servers, Group 2 of
servers, and a buffer of size
. We need to introduce both “states” and “policies” to express the stochastic dynamics of this data center. Let
, and
be the numbers of jobs in Group 1, Group 2, and the buffer, respectively. Therefore,
is regarded as a state of the data center at time
t. Let all the cases of such a state
form a set as follows:
where
For a state , it can be seen from the model description that there are four different cases: (a) By the transfer rules, if , then if either or , then . (b) If and , then the number of jobs in the buffer can increase until the waiting room is full, i.e., (c) If and , then the total number of jobs in Group 2 and the buffer is no more than the buffer size, i.e., (d) If and , then the number of jobs in the buffer can also increase until the waiting room is full, i.e.,
Now, for Group 2, we introduce an asynchronous dynamic policy, which is related to two dynamic actions (or sub-policies): from sleep to work (setup) and from work to sleep (close). Let and be the numbers of working servers and of sleeping servers in Group 2 at State , respectively. By observing the state set , we call and the setup policy (i.e., from sleep to work) and the sleep policy (i.e., from work to sleep), respectively.
Note that the servers in Group 2 can only be set up when all of them are idle; the setup policy () cannot be applied at the same time, because the servers in Group 2 are always governed by the sleep policy () whenever they are still serving some jobs. This is what we call asynchronous dynamic policies. Here, we consider the control optimization of the total system. For these two sub-policies, we provide an interpretation of the four different cases as follows:
(1) In if , then due to the transfer rule. Thus, there are no jobs in Group 2 or in the buffer, so that no policy in Group 2 is used;
(2) In the states will affect how the setup policy is used. If , , then is the number of working servers in Group 2 at State . Note that some of the slow servers need to be started first so that some jobs in the buffer can enter the activated slow servers; thus, , each of which can possibly take place under an optimal dynamic policy;
(3) From to the states will affect how the sleep policy is used. If , , then is the number of sleeping servers in Group 2 at State . We assume that the number of sleeping servers is no less than . Note that the sleep policy is independent of the work policy. Once the sleep policy is applied, the servers without jobs must enter the sleep state. At the same time, some working servers with jobs are also switched to the sleep state, and the jobs in those working servers are transferred to the buffer. It is easy to see that
(4) In if and , then may be any element in the set ; it is clear that
Our aim is to determine when, or under what conditions, an optimal number of servers in Group 2 should switch between the sleep state and the work state such that the long-run average profit of the data center is maximized. From the state space
, we define an asynchronous dynamic energy-efficient policy
as
where
and
are the setup and sleep policies, respectively; ‘⊠’ denotes that the policies
and
occur asynchronously; and
Note that is related to the fact that if there is no job in Group 2 at the initial time, then all the servers in Group 2 are in the sleep state. Once there are jobs in the buffer, we quickly set up some servers in Group 2 so that they enter the work state to serve the jobs. Similarly, we can understand the sleep policy . In the state subset , it is seen that the setup policy will not be needed because some servers are kept in the work state.
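To make the asynchronous structure of the two sub-policies concrete, the following minimal Python sketch encodes a policy as two maps defined on disjoint subsets of states, one for the setup action and one for the sleep action. The state encoding and all names are our own illustrative assumptions rather than the paper's notation; the only point is that, by construction, at most one of the two sub-policies can act at any given state, which is exactly the asynchrony described above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Illustrative state encoding (an assumption, not the paper's notation):
# (jobs in Group 1, jobs in Group 2, jobs in the buffer).
State = Tuple[int, int, int]

@dataclass
class AsynchronousPolicy:
    """Hypothetical encoding of an asynchronous dynamic policy for Group 2.

    `setup` maps states in which all Group-2 servers are asleep to the
    number of servers to set up; `sleep` maps states in which some Group-2
    servers are working to the number of servers to put (or keep) asleep.
    Because the two maps are defined on disjoint sets of states, the two
    sub-policies are never applied to the same state simultaneously.
    """
    setup: Dict[State, int]
    sleep: Dict[State, int]

    def action(self, state: State) -> Optional[Tuple[str, int]]:
        if state in self.setup:
            return ("setup", self.setup[state])
        if state in self.sleep:
            return ("sleep", self.sleep[state])
        return None  # cases with no Group-2 action, e.g., an empty Group 2
```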
For all the possible policies
given in (
1), we compose a policy space as follows:
Let
for any given policy
. Then
is a policy-based block-structured continuous-time Markov process on the state space
whose state transition relations are given in
Figure A3 in
Appendix B (we provide two simple special cases to understand such a policy-based block-structured continuous-time Markov process in
Appendix A). Based on this, the infinitesimal generator of the Markov process
is given by
where every block element
depends on the policy
(for notational simplicity, we omit “
”) and it is expressed in
Appendix C.
It is easy to see that the infinitesimal generator
has finitely many states and is irreducible with
, thus, the Markov process
is positive recurrent. In this case, we write the stationary probability vector of the Markov process
as
where
Note that the stationary probability vector
can be obtained by means of solving the system of linear equations
and
where
is a column vector of ones of suitable size. To this end, the RG-factorizations play an important role in our later computation. Note that some computational details are given in Chapter 2 of Li [
33].
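Before turning to the RG-factorization, it may help to recall the generic computation that it organizes: for a finite, irreducible generator, the stationary probability vector solves the balance equations together with the normalization condition (in the usual notation, π(d)Q(d) = 0 and π(d)e = 1). The following minimal Python sketch solves this linear system directly for a small generator; the toy matrix is illustrative only, since the actual generator Q(d) has the block structure given in Appendix C.

```python
import numpy as np

def stationary_distribution(Q: np.ndarray) -> np.ndarray:
    """Stationary probability vector of a finite, irreducible CTMC:
    solves pi Q = 0 subject to pi e = 1."""
    n = Q.shape[0]
    # Replace one (redundant) balance equation by the normalization pi e = 1.
    A = np.vstack([Q.T[:-1, :], np.ones(n)])
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Toy 3-state generator, for illustration only.
Q = np.array([[-2.0,  1.0,  1.0],
              [ 1.0, -3.0,  2.0],
              [ 0.5,  0.5, -1.0]])
pi = stationary_distribution(Q)
print(pi, pi @ Q)   # pi @ Q should be numerically zero
```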
Now, we use the UL-type RG-factorization to compute the stationary probability vector
as follows. For
and
, we write
Clearly,
and
. Let
and
Then the UL-type RG-factorization is given by
where
and
By using Theorem 2.9 of Chapter 2 in Li [
33], the stationary probability vector of the Markov process
is given by
where
is the stationary probability vector of the censored Markov chain
to level 0, and the positive scalar
is uniquely determined by
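Here, under the usual convention, the positive scalar is the normalization constant that makes the stationary probabilities sum to one. To illustrate the computational pattern behind the UL-type RG-factorization (eliminate the levels from the top down, solve for level 0 from the censored generator, and recover the higher levels through the R-measures), the following Python sketch carries out this level reduction for a finite block-tridiagonal generator. This is only a special case under our own assumption on the block structure; the actual blocks of Q(d) are those given in Appendix C, and Theorem 2.9 of Li [33] covers the general block-structured case.

```python
import numpy as np

def stationary_by_level_reduction(A, B, C):
    """UL-type level reduction (censoring) for a finite block-tridiagonal
    generator: A[0..M] are the diagonal blocks, B[0..M-1] the level-up
    blocks, and C[1..M] the level-down blocks (C[0] is unused, e.g. None).
    Returns the stationary probability vectors of levels 0..M.

    Illustrative special case only: the actual generator Q(d) of the data
    center has the block structure given in Appendix C.
    """
    M = len(A) - 1
    U = [None] * (M + 1)
    U[M] = A[M]
    # Censor out levels M, M-1, ..., 1: U[k] is the generator of the chain
    # censored to levels 0..k, restricted to level k.
    for k in range(M - 1, -1, -1):
        U[k] = A[k] + B[k] @ np.linalg.solve(-U[k + 1], C[k + 1])
    # Level 0: solve pi_0 U_0 = 0 together with a temporary normalization.
    n0 = U[0].shape[0]
    lhs = np.vstack([U[0].T[:-1, :], np.ones(n0)])
    rhs = np.concatenate([np.zeros(n0 - 1), [1.0]])
    pi = [np.linalg.lstsq(lhs, rhs, rcond=None)[0]]
    # Recover the higher levels through the R-measures
    # R_k = B_k (-U_{k+1})^{-1}, i.e., pi_{k+1} = pi_k R_k.
    for k in range(M):
        pi.append(pi[k] @ B[k] @ np.linalg.inv(-U[k + 1]))
    total = sum(level.sum() for level in pi)
    return [level / total for level in pi]
```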
Remark 3. The RG-factorizations provide a unified, constructive, and algorithmic framework for the numerical computation of many practical stochastic systems. They can be applied to provide effective solutions for block-structured Markov processes, and are also useful for the optimal design and dynamic decision-making of many practical systems; see Li [33] for more details.
The following theorem provides some useful observations on some special policies , which have no effect on the infinitesimal generator or the stationary probability vector .
Theorem 1. Suppose that two asynchronous energy-efficient policies satisfy one of the following two conditions: (a) for each , if , then we take as any element of the set ; (b) for each , if , then we take . Under both such conditions, we have
Proof of Theorem 1. It is easy to see from (
2) that all the levels of the matrix
are the same as those of the matrix
, except level 1. Thus, we only need to compare level 1 of the matrix
with that of the matrix
.
For the two asynchronous energy-efficient policies
satisfying the conditions (a) and (b), by using
in (
2), it is clear that for
, if
, then
Thus, it follows from (
2) that
This also gives that
and thus
This completes the proof. □
Note that Theorem 1 will be necessary and useful for analyzing policy monotonicity and optimality in our later study; see, for example, the proof of Theorem 4.
Remark 4. This paper is the first to introduce and study the asynchronous dynamic policy for energy-efficient data centers. We highlight the impact of the two asynchronous sub-policies (the setup and sleep policies) on the long-run average profit of the energy-efficient data center.
3.2. The Reward Function
For the Markov process , we now define a suitable reward function for the energy-efficient data center.
Based on the above cost and price notation in
Table 1, a reward function with respect to both states and policies is defined as a profit rate (i.e., the total revenues minus the total costs per unit of time). Therefore, according to the impact of the asynchronous dynamic policy on the profit rate, the reward function at State
under policy
is divided into four cases as follows:
Case (a): For
and
, the profit rate is not affected by any policy, and we have
Note that in Case (a), there is no job in Group 2 or in the buffer. Thus, it is clear that each server in Group 2 is in the sleep state.
However, in the following two cases (b) and (c), since there are some jobs either in Group 2 or in the buffer, the policy plays a key role in opening (i.e., setup) or closing (i.e., sleep) some servers of Group 2 so that energy is used efficiently.
Case (b): For
,
and
the profit rate is affected by the setup policy
, and we have
where
is an indicator function whose value is 1 when the event is in
, and 0 otherwise.
Case (c): For
,
and
, the profit rate is affected by the sleep policy
, and we have
Note that the job transfer rate from Group 2 to Group 1 is given by . If , then and . If and , then . If and , then .
Case (d): For
,
and
, the profit rate is not affected by any policy, and we have
We define a column vector composed of the elements
,
and
as
where
In the remainder of this section, the long-run average profit of the data center (or the policy-based continuous-time Markov process
) under an asynchronous dynamic policy
is defined as
where
and
are given by (
3) and (
8), respectively.
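Assuming the long-run average profit takes the usual form of the stationary probability vector multiplied by the reward (profit-rate) column vector, it can be evaluated numerically once both quantities are available, for example level by level. The following small Python sketch is illustrative only and uses our own names.

```python
def long_run_average_profit(pi_blocks, f_blocks):
    """Long-run average profit under a fixed policy, assuming the usual
    form: the stationary probability vector multiplied by the reward
    (profit-rate) column vector, both partitioned into conformable
    per-level numpy blocks."""
    return float(sum(pi @ f for pi, f in zip(pi_blocks, f_blocks)))

# For example, with the per-level output of the level-reduction sketch
# above and a matching list of per-level reward vectors:
#     eta = long_run_average_profit(pi_blocks, f_blocks)
```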
We observe that as the number of working servers in Group 2 decreases, the total revenues and the total costs of the data center decrease together, and vice versa. Equivalently, as the number of sleeping servers in Group 2 increases, the total revenues and the total costs of the data center decrease together. Thus, choosing a suitable number of working and/or sleeping servers in Group 2 through the setup and sleep policies involves a tradeoff between the total revenues and the total costs. This motivates us to study an optimal dynamic control mechanism for the energy-efficient data center. Thus, our objective is to find an optimal asynchronous dynamic policy
such that the long-run average profit
is maximized, that is,
Since the setup and sleep policies
and
occur asynchronously, they cannot interact with each other at any time. Therefore, the optimal policy can be decomposed into
In fact, it is difficult and challenging to analyze the properties of the optimal asynchronous dynamic policy , and to provide an effective algorithm for computing the optimal policy . To address these issues, in the next sections we introduce sensitivity-based optimization theory to study this energy-efficient optimization problem.
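Before developing the sensitivity-based approach, a brute-force baseline is worth keeping in mind: since both the state space and the policy space are finite, an optimal policy can in principle be found by enumeration, although this quickly becomes impractical as the numbers of servers and the buffer size grow. The Python sketch below is schematic; `build_generator(d)` and `build_reward(d)` are hypothetical constructors that would have to encode the blocks of Appendix C and the reward cases (a)-(d) above.

```python
def optimal_policy_by_enumeration(policies, build_generator, build_reward):
    """Baseline search for an optimal policy over a finite policy space by
    maximizing the long-run average profit (computed with the helpers
    sketched earlier in this section)."""
    best_policy, best_profit = None, float("-inf")
    for d in policies:
        Q = build_generator(d)            # hypothetical: Q(d), Appendix C
        f = build_reward(d)               # hypothetical: f(d), cases (a)-(d)
        pi = stationary_distribution(Q)   # from the earlier sketch
        profit = float(pi @ f)            # long-run average profit
        if profit > best_profit:
            best_policy, best_profit = d, profit
    return best_policy, best_profit
```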