Energy Efficient and Delay Aware 5G Multi-Tier Network

Abstract: Multi-tier heterogeneous networks (HetNets) with dense deployment of small cells in 5G networks are expected to meet the ever-increasing data traffic demands and to offer improved coverage in indoor environments. However, HetNets raise major concerns for mobile network operators, such as complex distributed control plane management, handover management issues, increased latency and increased energy expenditure. Sleep mode implementation in multi-tier 5G networks has proven to be an effective approach for reducing energy expenditure. In this paper, a Markov Decision Process (MDP)-based algorithm is proposed to switch between three different power consumption modes of a base station (BS) in order to improve energy efficiency and reduce latency in 5G networks. The MDP-based approach switches between the states of the BS based on the offered traffic while maintaining a prescribed minimum channel rate per user. Simulation results show that the proposed MDP algorithm, together with the three-state BSs, yields a significant gain in terms of energy efficiency and latency.


Introduction
The next generation (5G) of wireless networks will serve an unprecedented number of devices, providing ubiquitous connectivity as well as innovative and rate-demanding services. Hence the design of 5G networks must treat energy efficiency as one of its key pillars. Recent studies indicate that a multi-tier 5G network is an effective way to reduce energy consumption across the entire network [1]. Furthermore, the gain can be substantial if the architecture allows switching off the network resources or base stations (BSs) that are not needed to guarantee the target QoS for the offered traffic. In that case, a minimal or zero power consumption mode of a base station can play a crucial role in increasing the energy efficiency of the network. Thus a 5G BS, known as a gNB, can be turned off or put in a low power consumption mode to reduce global energy consumption when its service is not needed to ensure QoS. However, we should also consider the delay caused by the wake-up time from sleep mode. The deepest sleep mode consumes zero power but can cause significant service delay due to the wake-up time, whereas in a lighter sleep mode (also known as stand-by mode) a resource consumes little power but wakes up very quickly. Hence our proposed model fills a gap in the performance evaluation of two types of minimal power consumption mode, with a trade-off between energy efficiency and wake-up delay, while ensuring user-perceived QoS. In this work, low and zero power consumption modes are implemented on the BSs of a multi-tier 5G network in order to reduce the energy consumption of the network. We propose three different power consumption modes for BSs and a novel MDP-based algorithm to control these modes under a QoS constraint.
Novelty of our work: As already presented in some pioneering works [2][3][4], switching off a base station within a HetNet is an effective way to save energy during low traffic conditions. However, each sleeping BS takes a significant amount of time to wake up and become fully operable. As a consequence, some users might experience a delay in accessing the network, and some calls might even be dropped. In order to overcome the problem of call dropping due to the wake-up delay, we propose an energy efficient '3-state MDP model' for the BSs within the network. We propose three different modes for the BSs, namely 'Active mode', 'Stand-by mode' and 'Sleep mode'. In this paper, we use a Markov model and a Markov decision process (MDP) to find the optimal policy in terms of energy efficiency. We also show the energy-delay trade-off in order to design a network that is both energy efficient and delay aware. We differentiate the stand-by mode from the sleep mode as follows: a BS in stand-by mode consumes a small amount of energy but takes negligible time to return to active mode, whereas a BS in sleep mode does not consume any energy but takes more time to wake up. Hence sleep mode gives us the advantage of power saving, whereas stand-by mode allows us to avoid the wake-up delay. That is why we implement both the stand-by mode and the sleep mode in our algorithm, so that we can reduce both the energy consumption and the wake-up delay of BSs. The hardware and software setups for these low power consumption and zero power consumption modes have already been proposed in the literature [5,6]. We explain the operation and switching procedure among these modes in the following sections. Another novelty of our work is a new reward function for the MDP, which helps us obtain an optimal policy for selecting a particular mode for each BS.
By applying the proposed 3-state MDP model, we find that a significant amount of energy can be saved in low traffic conditions while still fulfilling the QoS requirement in terms of data rate.
The rest of the paper is organized as follows: Section 2 outlines related work in the concerned area. Network model, three state Markov model and proposed MDP-based algorithm model are presented in Sections 3-5, respectively. Section 6 presents the power consumption model. The simulation results and performance analysis are presented in Section 7. Finally Section 8 concludes the paper.

Related Work
Energy efficiency in cellular networks and communication has been studied widely in the literature [2][3][4][5][7][8][9]. Some research papers [2][3][4][5] have proposed different algorithms to implement sleep mode in LTE BSs. The authors in [5] proposed an approach to reduce energy consumption in mobile networks by introducing discontinuous transmission on the base station side. In some of the pioneering works, MDP has been used as an effective approach for sleep mode implementation [2][7][8][9] for green communication, as well as to solve other optimization problems [10,11]. The authors in [10] use a Markov decision process-based model to schedule consumers' behaviors in order to optimize the consumers' net benefits. They consider a networked smart grid system, where future electricity generation is predicted with reasonable accuracy based on weather forecasts. In [11], the authors proposed an algorithm to design efficient coding tools and optimize the frame structure for transmission, in order to facilitate view switching and contain error propagation in differentially coded video due to packet losses. In [2], the authors proposed an MDP-based optimal controller associated with an activation/deactivation policy that maximizes a multiple objective function of the QoS and improves energy efficiency. Other papers such as [7,8] consider a single user and use a Markov chain technique to evaluate the energy savings due to the sleep mode mechanism of a single user terminal. The authors in [8] take correlated packet arrivals into account to evaluate an MDP-based sleep mode mechanism. The authors in [9] consider a similar setting of one user and one station and show how to derive the optimal sleep policy numerically by formalizing the problem as a Semi-MDP. The authors in [3] proposed a novel scheme for sleep scheduling based on a decentralized partially observable MDP (Dec-POMDP).
However, almost all of the above-mentioned papers put the BS in sleep mode in low traffic conditions and keep it in active mode for the rest of the time. We refer to this model as the '2-state model'. The authors of references [2][3][4][5][12] have shown that this 2-state power consumption model can reduce the energy consumption of the whole network. However, the problem is that the BS takes significant time to wake up from sleep mode, which may cause call drops for new users. This wake-up time can range from tens of seconds to a couple of minutes for a small cell and up to 10-15 min for a macro cell [6]. This is clearly a constraint for an energy efficient system. The authors in reference [5] proposed a low power consumption mode, which consumes a small amount of power but wakes up within negligible time; their power consumption model is similar to what we propose as stand-by mode. In contrast to the works in the literature so far, we propose an MDP-based algorithm on the base stations so that they can intelligently switch among three different modes.

Network Model
We consider a multi-tier 5G network where some base stations can be put in a low power consumption mode, as shown in Figure 1. In our proposed model we implement low and zero power consumption modes on the base stations within this multi-tier 5G network. The proposed algorithm runs in the algorithm management unit, which is included in the base band unit of the network architecture. The Mobility Management Entity (MME) of the Evolved Packet Core (EPC) also contributes to the algorithm, as the MME can inform the BS if any new user is approaching the cell and if any handover is about to happen. Thus, the MME can help the BS to update the total number of active users. Hence the centralized management unit is included in the MME. The home subscriber server (HSS), MME, serving gateway (S-GW) and packet data network gateway (P-GW) are included in the EPC. The nomenclature used in the rest of the paper is provided in Table 1.

Traffic Model:
We adopt a traffic model where the users arrive in the coverage area of the BS according to a Poisson process with a certain arrival rate, $\lambda_u$, and death rate, $\mu_u$. Here we include all the users originating calls in the cell as well as the users being handed over from other cells.
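The traffic model above can be illustrated with a small event-driven simulation. This is only a sketch under stated assumptions: the paper does not say whether the death rate applies per user or to the cell as a whole, so the code assumes each active user departs independently at rate `death_rate`. The function name `simulate_users` and its parameters are hypothetical.

```python
import random


def simulate_users(arrival_rate, death_rate, horizon, seed=0):
    """Simulate the number of active users as a birth-death process.

    Assumptions (not taken from the paper): arrivals are Poisson with
    rate `arrival_rate` (users/s) and each active user leaves
    independently at rate `death_rate` (1/s).
    Returns a list of (time, user_count) change points.
    """
    if arrival_rate <= 0:
        raise ValueError("arrival_rate must be positive")
    rng = random.Random(seed)
    t, n = 0.0, 0
    trace = [(t, n)]
    while True:
        # Total event rate: one arrival stream plus n departure streams.
        total_rate = arrival_rate + n * death_rate
        t += rng.expovariate(total_rate)
        if t >= horizon:
            break
        # The next event is an arrival with probability arrival_rate / total_rate.
        if rng.random() < arrival_rate / total_rate:
            n += 1
        else:
            n -= 1
        trace.append((t, n))
    return trace
```

Sampling such a trace at each decision epoch gives the user count that drives the bandwidth requirement.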
Propagation Channel Model: We consider the log-normal shadowing model with a pathloss exponent, $\alpha$, and a shadowing variance, $\sigma$, under an AWGN channel. The received signal to noise ratio (SNR) per user is determined by the link budget calculation in Equation (1), where $P_{out}$ is the transmit power of the BS; $PL$ is the pathloss; $G_t$ and $G_r$ are the transmit and receive antenna gains respectively; $X$ is the log-normal shadowing and $P_n$ is the noise power. Equation (2) is used to find the pathloss $PL$, where $d_0$ is a reference distance, $f_c$ is the carrier frequency, $c$ is the speed of light and $d$ is the distance between the BS and the user. Please note that, for the sake of simplicity, we consider zero interference in the network. The values of $d_0$ and $c$ are constant and are provided in Table 2.
Shannon's capacity formula [13] is then used to determine the bandwidth required by all of the active users for a minimum target data rate, as shown in Equation (3); here $R_t$ is the target data rate and $snr$ is the signal to noise ratio in linear scale. This $BW_{req}$ is the measure of the minimum required QoS. Our proposed algorithm ensures that this QoS requirement is fulfilled while reducing energy consumption and delay.
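The link budget and bandwidth calculation can be sketched as follows. Equations (1)-(3) are not reproduced in this text, so the forms below are assumptions based on the symbol definitions: a standard log-distance pathloss with free-space loss at $d_0$, a dB-domain link budget in which shadowing is subtracted, and a per-user bandwidth of $R_t / \log_2(1 + snr)$ summed over users. All function names are hypothetical.

```python
import math

C = 3.0e8  # speed of light in m/s


def pathloss_db(d, d0, fc, alpha):
    """Assumed form of Eq. (2): free-space loss at d0 plus 10*alpha*log10(d/d0)."""
    pl_d0 = 20.0 * math.log10(4.0 * math.pi * d0 * fc / C)
    return pl_d0 + 10.0 * alpha * math.log10(d / d0)


def snr_db(p_out_dbm, pl_db, g_t_db, g_r_db, shadow_db, noise_dbm):
    """Assumed form of Eq. (1): link budget with all quantities in dB or dBm."""
    return p_out_dbm + g_t_db + g_r_db - pl_db - shadow_db - noise_dbm


def required_bandwidth_hz(target_rate_bps, snr_values_db):
    """Assumed form of Eq. (3): sum over users of R_t / log2(1 + snr)."""
    total = 0.0
    for s in snr_values_db:
        snr_lin = 10.0 ** (s / 10.0)  # convert dB to linear scale
        total += target_rate_bps / math.log2(1.0 + snr_lin)
    return total
```

For instance, a user at 0 dB SNR needs exactly $R_t$ hertz under this formula, since $\log_2(1 + 1) = 1$.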

Proposed Power Consumption Modes
We have proposed a three state model, where the BSs of the network can be in any of the following three states:

1. Active Mode: In active mode, the BS is fully operational and consumes maximum power.

2. Stand-by Mode: In stand-by mode, the BS is in a low power consumption, non-operational mode, where it consumes a small amount of power but requires negligible wake-up time to return to active mode. Researchers from Ericsson have reported that a BS consumes approximately 10 W in stand-by mode and takes approximately 30 µs to go into active mode [5].

3. Sleep Mode: In sleep mode, the BS is totally switched off so that it consumes almost zero power, but it takes longer to wake up. The authors in [6] have reported that small cells can take from tens of seconds to a couple of minutes to wake up from sleep mode, whereas a macro BS takes 10-15 min. Please note that there is some non-zero, ultra-low power consumption during sleep mode; in this work it is assumed to be negligible compared to the power consumed in active mode. Hence sleep mode is treated as a zero power consumption mode in this work. Table 3 outlines the delay required for a BS to remain in active mode or to become active from stand-by and sleep mode.

Motivation of Using Three Different Modes
The well-known advantage of sleep mode is that a BS consumes zero power while sleeping; its disadvantage is that it takes around 40 s [14] to get back into fully operational mode. On the contrary, a BS in stand-by mode consumes a small amount of power but takes negligible time to wake up. Therefore we use both modes, so that we gain the power saving advantage of sleep mode and reduce the activation delay by means of stand-by mode. Figure 2 graphically depicts the time and energy required to switch between active mode and sleep mode. The energy gain from being in sleep mode equals $(P_{act} - P_{slp})\,t_{slp}$, where $P_{act}$ and $P_{slp}$ represent the power consumption in active and sleep mode, respectively, and $t_{slp}$ is the duration of the time spent in sleep mode. It is evident from the figure that a BS needs quite a long time to wake up from sleep mode to active mode. This wake-up delay issue of sleep mode has been reported in [14], as shown in Figure 3. These results were acquired from experiments and show that a resource takes approximately 5 s to turn off completely (i.e., to go into sleep mode) and approximately 60 s to return to active mode and be completely operational again. That is why we need to put one of the BSs in stand-by mode, so that it can be activated with minimal delay when needed. It is worth mentioning that implementing three different states incurs extra overhead; however, we make sure that the energy consumed by this overhead is less than the energy saved by the algorithm.
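For a sense of scale, the stand-by power quoted from [5] can be turned into an energy cost. The calculation below is only an illustration: it assumes a BS stays in stand-by for a full hour ($t = 3600$ s, a hypothetical duration chosen for round numbers) and takes $P_{std} \approx 10$ W.

```latex
E_{std} = P_{std}\, t \approx 10\ \mathrm{W} \times 3600\ \mathrm{s} = 36\,000\ \mathrm{J} = 10\ \mathrm{Wh}
```

Under the zero-power sleep assumption, this is the extra energy that stand-by costs relative to sleep over that hour, in exchange for a wake-up time of microseconds rather than tens of seconds.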

Power Consumption Model
The total power consumed in the network depends on the number of active, stand-by and sleeping BSs. The power consumption model presented in Equation (4) is used to determine the total power consumption of a BS for our proposed 3-state model; it is derived from [15,16].
where $N_{act}$, $N_{std}$ and $N_{slp}$ denote the number of BSs in active mode, stand-by mode and sleep mode respectively; $P_{out}$ is the output (transmit) power; $\Delta_P$ is the slope of the load-dependent power consumption, represented as a linear transmission power dependence factor in [15]; and $P_0$ and $P_{std}$ are the power consumption at minimum non-zero load and in stand-by mode respectively. Please note that the BSs in sleep mode consume zero power. The reference values of all these variables for a macro BS [16] are shown in Table 2. We assume $\Delta_P$ to be unchanged, as suggested in [15,16]. We apply trapezoidal numerical integration [17] to the power consumption curve to find the total energy consumption of the BS. For the 'All BS active' mode we do not apply any sleep or stand-by mode to the BSs; hence all the BSs are always active regardless of the traffic requirement. The power consumption model for this mode becomes Equation (5), where $N$ is the total number of available BSs in the eNodeB and $N_{tx}$ is the number of transmitting BSs, which is equivalent to $N_{act}$ in Equation (4).

Two-State Power Consumption Model:
In the commonly used 'two-state model' [2][3][4][5][12], all of the inactive BSs are kept either in stand-by mode or in sleep mode when they have no active users. This model can be expressed as the power consumption model in Equation (6). We compare our proposed model with two different 'two-state models', namely the 'Rangisetti 2 state (active-standby) model', where $N_{low} = N_{std}$, and the 'MDP-based 2 state (active-sleep) model', where $N_{low} = N_{slp}$.
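The three power models and the energy integration can be sketched as below. Equations (4)-(6) are not reproduced in this text, so the expressions are assumptions reconstructed from the symbol definitions and the linear load-dependent model of [15,16]: an active BS draws $P_0 + \Delta_P P_{out}$, a stand-by BS draws $P_{std}$, and a sleeping BS draws nothing. Function names are hypothetical.

```python
def power_three_state(n_act, n_std, p_out, p_0, delta_p, p_std):
    """Assumed form of Eq. (4); sleeping BSs contribute zero power."""
    return n_act * (p_0 + delta_p * p_out) + n_std * p_std


def power_all_active(n_total, n_tx, p_out, p_0, delta_p):
    """Assumed form of Eq. (5): every BS pays p_0, only transmitting BSs pay the load term."""
    return n_total * p_0 + n_tx * delta_p * p_out


def power_two_state(n_act, n_low, p_out, p_0, delta_p, p_low):
    """Assumed form of Eq. (6): p_low is p_std (active-standby) or 0 (active-sleep)."""
    return n_act * (p_0 + delta_p * p_out) + n_low * p_low


def energy_joules(times_s, powers_w):
    """Trapezoidal integration of a sampled power curve (watts over seconds)."""
    if len(times_s) != len(powers_w):
        raise ValueError("times and powers must have the same length")
    energy = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        energy += 0.5 * (powers_w[i] + powers_w[i - 1]) * dt
    return energy
```

Evaluating one of the power functions at each decision epoch and passing the resulting samples to `energy_joules` gives the total energy over the observation period.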

Three State Markov Model for a Base Station
The base stations (BSs) follow a Markov decision process (MDP) to decide when to switch among different states. The transitions among the three states can be represented as a three-state Markov model [18], shown in Figure 4. In this figure, $S_0$, $S_1$ and $S_2$ represent the states of the $n$th base station in sleep, stand-by and active mode respectively; and $a_0$, $a_1$ and $a_2$ represent the action of the $(n-1)$th BS going into or staying in sleep, stand-by and active mode respectively. According to this figure, while the $n$th BS is in state $S_0$ (sleep mode) and an action $a_0$ or $a_1$ is taken, so that the $(n-1)$th BS goes into or remains in sleep mode or stand-by mode respectively, the $n$th BS remains in state $S_0$ with transition probability 1. However, if action $a_2$ is taken so that the $(n-1)$th BS goes into active mode, then the $n$th BS goes to state $S_1$ (stand-by mode) with transition probability 1. While the $n$th BS is in state $S_1$ (stand-by), the only possible actions are $a_1$ and $a_2$ as per our proposed model. This is because the BSs change state in sequence: when the load increases, the $(n-1)$th BS goes to active mode before the $n$th BS. In other words, if the $n$th BS is in stand-by mode then the $(n-1)$th BS cannot be in sleep mode; it must be in stand-by or active mode. Therefore, when the $n$th BS is in state $S_1$, action $a_1$ takes it to state $S_0$ with probability 1, whereas action $a_2$ takes it to state $S_2$ with probability $\Pr^n_{12}$ or leaves it in state $S_1$ with probability $\Pr^n_{11}$. Finally, when the $n$th BS is in state $S_2$ (active mode), the only possible action is $a_2$, which takes the $n$th BS to state $S_1$ with probability $\Pr^n_{21}$ or leaves it in state $S_2$ with probability $\Pr^n_{22}$.
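The transition structure described above can be written down as a lookup table keyed by (state, action). This is a minimal sketch: the dictionary layout and the function name `build_transitions` are illustrative choices, and the four probabilities are inputs because their values depend on the traffic parameters.

```python
def build_transitions(pr11, pr12, pr21, pr22, tol=1e-9):
    """Return {(state, action): {next_state: probability}} for the nth BS.

    States: S0 = sleep, S1 = stand-by, S2 = active.
    Actions a0, a1, a2: the (n-1)th BS goes to or stays in sleep,
    stand-by or active mode. Only the (state, action) pairs allowed
    by the model are present in the table.
    """
    if abs(pr11 + pr12 - 1.0) > tol or abs(pr21 + pr22 - 1.0) > tol:
        raise ValueError("probabilities out of each state must sum to 1")
    return {
        ("S0", "a0"): {"S0": 1.0},
        ("S0", "a1"): {"S0": 1.0},
        ("S0", "a2"): {"S1": 1.0},
        ("S1", "a1"): {"S0": 1.0},
        ("S1", "a2"): {"S1": pr11, "S2": pr12},
        ("S2", "a2"): {"S2": pr22, "S1": pr21},
    }
```

Leaving disallowed pairs such as ("S1", "a0") out of the table makes the sequencing constraint explicit: a lookup for an action the model forbids fails instead of silently returning a probability.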

Proposed MDP-Based Algorithm
In this section, we describe our Markov decision process (MDP)-based algorithm for obtaining the optimal policy that decides the mode of each base station. The following reward function, transition probabilities and the Value Iteration Algorithm (VIA) are used to solve the optimization problem.

Reward:
The total reward function for an action $a$ is defined in Equation (7), where $b_a$ is the total required bandwidth, and $b_{max}$ and $b_{min}$ are the maximum and minimum required bandwidth respectively. It is noteworthy that the required bandwidth is taken to be proportional to the required energy consumption, because a lower required bandwidth allows more BSs to be put in a low power consumption mode and hence saves more energy. Therefore, maximizing the expected reward from Equation (7) helps us find the optimal policy in terms of energy efficiency.

Transition Probability:
The transition probability between two states of the base stations is learned by Monte-Carlo simulation for a particular call arrival rate and death rate. The treatment of the learning part is not within the scope of this paper.

Value Iteration Algorithm:
If we denote by $V(s)$ the maximum expected total reward for an initial state $s$ and future state $s'$, then the optimality equation is given by Equation (8). Here, $R(s, a)$ is the reward function for a state $s$ and action $a$ as explained in Equation (7); $\Pr[s' \mid s, a]$ is the transition probability between current state $s$ and future state $s'$ for an action $a$; $A$ is the set of all possible actions and $S$ is the set of all possible states. The solution of the optimality equation corresponds to the maximum expected total reward $V(s)$ and the MDP optimal policy $\pi(s)$. This optimal policy $\pi(s)$ indicates which mode is allocated to each base station. As explained in Algorithm 1, the value iteration algorithm (VIA) [19] is used to solve this optimization problem. The number of epochs before reaching equilibrium is distributed with mean $1/(1-\lambda)$, where $0 \le \lambda < 1$ is the discount factor. When the expected service duration is known, $\lambda$ is set to match the equivalent number of epochs. For example, we set $\lambda = 0.975$ to model a mean service time of 40 epochs.
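A generic value iteration routine is sketched below for illustration. It is not the paper's Algorithm 1, which is not reproduced in this text. It assumes the standard discounted optimality form $V(s) = \max_{a \in A} \{ R(s,a) + \lambda \sum_{s' \in S} \Pr[s' \mid s,a]\, V(s') \}$, and the data layout (dictionaries keyed by state and action) is a hypothetical choice.

```python
def value_iteration(states, actions, transitions, rewards, discount,
                    tol=1e-9, max_iter=100000):
    """Solve a discounted MDP by value iteration.

    actions[s]          -> list of actions allowed in state s
    transitions[(s, a)] -> {next_state: probability}
    rewards[(s, a)]     -> immediate reward
    Returns (values, policy) as dictionaries keyed by state.
    """
    def q_value(s, a, v):
        # Immediate reward plus discounted expected value of the next state.
        expected = sum(p * v[s2] for s2, p in transitions[(s, a)].items())
        return rewards[(s, a)] + discount * expected

    v = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_v = {s: max(q_value(s, a, v) for a in actions[s]) for s in states}
        delta = max(abs(new_v[s] - v[s]) for s in states)
        v = new_v
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(actions[s], key=lambda a: q_value(s, a, v)) for s in states}
    return v, policy


def mean_epochs(discount):
    """Mean horizon implied by a discount factor: 1 / (1 - discount)."""
    return 1.0 / (1.0 - discount)
```

With `discount = 0.975`, `mean_epochs` evaluates to approximately 40, matching the mean service time quoted above.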

Simulation Results and Performance Analysis
The traffic model presented in Section 3 is used to determine the total number of users in the coverage area of the BS at each decision epoch under the above-mentioned propagation channel environment. We use Matlab to simulate our proposed algorithm. The Markov decision process, along with the value iteration algorithm, is used to find the required number of active, stand-by and sleeping base stations. The total observation period is 3600 s and the simulation parameters are given in Table 2. The simulation has been run for 10,000 iterations and the results are then averaged. The algorithm takes approximately 18,700,000 addition operations, 23,320,000 subtraction operations, 20,830,000 multiplication operations, 3,510,000 division operations and 2,900,000 logical AND operations to complete the 10,000 iterations. For the set of parameters provided in Table 2, the transition probability matrix for the second BS (BS2) is given in Equation (9), and the transition probabilities for the third base station (BS3) are shown in Figure 5.
For our proposed model, BS2 is capable of moving only between stand-by mode and active mode, depending on the traffic condition. For the third BS, the transitions among all three states are presented in Figure 5. In this figure, $S_0$, $S_1$ and $S_2$ represent the states of the base station (BS3) in sleep, stand-by and active mode respectively; and $a_0$ and $a_1$ represent the action of BS2 becoming stand-by and active respectively. Figure 6 compares the power consumption of the network for always-active BSs, referred to as 'All BS Active', with the 2-state model proposed in [5] and the remaining models under comparison. The plots clearly show that, using the proposed MDP model, we can save a significant amount of power compared to 'All BS active' and the 'Rangisetti 2 state model' in low traffic conditions. This is because, as per our proposed model, some of the BSs are in active mode, one of them is in stand-by mode and the rest are in sleep mode, whereas in the 'Rangisetti 2 state model' all of the unused BSs are in stand-by mode, which causes higher energy consumption. However, our proposed MDP-based 3 state model consumes slightly more energy than the 'MDP-based 2 state model' (the Combes model), because that model puts all of the unused BSs in sleep mode only, which results in less power consumption than our proposed model but causes more wake-up delay. This wake-up delay can cause call drops for new users who would need to wait for a sleeping BS to wake up. As expected, the power consumption is the same for the two-state and three-state models in higher traffic conditions, as almost all of the BSs need to be active in order to support the load. To examine the delay performance of the four models we generate Figure 7, which shows the percentage of the total users experiencing delay (by delay group, as defined in Table 3) in receiving service within the observation period.
As expected, all the users in the 'All BS active' model experience no delay at all (delay group-1), because all of the BSs are always active in this mode. On the other hand, the 'Rangisetti 2 state model' uses stand-by mode for the unused BSs, and our proposed model uses stand-by and sleep mode for the unused BSs. Therefore only a few (around 12%) of the users experience approximately 30 µs of delay (delay group-2) in both of these models. It is noteworthy that, as per the proposed MDP-based 3 state model, the users experience delay only when a base station transitions from stand-by to active mode. The time needed for the sleeping BS to go to stand-by mode does not cause delay to the users. Note that in the Rangisetti model, the delay is also caused by the transition from stand-by to active mode. That is why the delays for the 'Rangisetti 2 state model' and the proposed 3 state model are the same. On the contrary, these users (around 12%) experience 40 s of delay (delay group-3) in the 'MDP-based 2 state model', as this model implements only sleep mode for unused BSs, and it takes more time for the BSs to go from sleep mode to active mode. Therefore, our proposed model offers a fair balance of energy efficiency and delay in low traffic conditions.

Effect of the Parameter Variation
We apply trapezoidal numerical integration [17] to the power consumption curves in order to find the total energy consumption of the network. We find that the MDP-based three state model reduces energy consumption by approximately 40% compared to the 'All BS active' model. However, as one can expect from the power consumption curves, our proposed model consumes slightly more energy than the MDP-based 2 state model, but this disadvantage is outweighed by the benefit of reduced delay compared to the Combes model. In this section, we vary different parameters and show their effect on the energy consumption of the BS. First, we vary the coverage range of the BS and find that the energy consumption increases rapidly beyond a certain range, as shown in Figure 8. For the set of parameters in Table 2, the energy consumption remains almost constant up to a BS range of 1.5 km, but increases very rapidly for larger BS ranges. This holds for all of the models we have used. These results show that our proposed model consumes less energy than the 'All BS active' model and the 'Rangisetti 2 state (active-standby) model', but slightly more energy than the 'MDP-based 2 state (active-sleep) model'. Figure 9 shows the expected delay experienced by the users, from which we can see that the 'MDP-based 2 state (active-sleep) model' incurs more delay than the 'Proposed 3 state model' and the 'Rangisetti 2 state (active-standby) model'. As per the proposed MDP-based 3 state model, the users experience delay only when a base station transitions from stand-by to active mode. The time needed for the sleeping BS to go to stand-by does not cause delay to the users. Note that in the Rangisetti model, the delay is also caused by the transition from stand-by to active mode. That is why the delays for the 'Rangisetti 2 state model' and the proposed 3 state model are the same.
As expected, the delay increases with the BS range; however, beyond a certain range, when most of the BSs are active, the activation delay reduces. Please note that different BS ranges represent different types of BS (Pico, Femto, Micro, Macro, etc.). Figures 8 and 9 provide good evidence that our proposed model offers a fair balance of energy efficiency and wake-up delay for any type of BS.
We also observe the effect of the pathloss exponent and shadowing variance on the energy consumption (for a macro BS as an example) and on the expected delay; the results are shown in Figures 10-12. As expected, the energy consumption of the BS increases with the pathloss exponent and shadowing variance, which is clearly depicted in the figures. Figure 12 shows that the 'MDP-based 2 state (active-sleep) model' incurs higher delay than the 'Proposed 3 state model' and the 'Rangisetti 2 state (active-standby) model'. All of these results again show that our proposed model is efficient and offers a fair balance of reduced energy consumption and reduced wake-up delay.

Conclusions
In this paper, we propose a novel strategy to implement sleep mode in the base stations within a multi-tier 5G network using a Markov decision process. The proposed method considers a three-state base station model with active, sleep and stand-by modes, where the BS switches between states according to the MDP-based algorithm. In the MDP-based approach we have proposed a novel reward function, which helps us find the optimal policy depending on the traffic condition and QoS requirement in order to improve energy efficiency. The results are compared with other state-of-the-art algorithms, from which we find that the 'MDP-based 2 state (active-sleep) model' offers the lowest energy consumption but the highest delay, whereas the 'Rangisetti 2 state (active-standby) model' reduces this delay but increases the energy consumption significantly. On the contrary, our proposed MDP-based 3 state model offers a fair balance of energy efficiency and delay: the network saves a good amount of energy compared to the Rangisetti 2 state model, with negligible delay compared to the MDP-based 2 state model. Therefore the proposed model offers a good trade-off between energy efficiency and delay. Our future work will focus on the effect of interference introduced in HetNets and on updating the algorithm to incorporate it. Moreover, a stochastic geometry-based energy efficiency analysis is also planned as part of the future work.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

BS: Base station
HetNet: Heterogeneous network
MDP: Markov decision process
QoS: Quality of Service
SNR: Signal to noise ratio
VIA: Value iteration algorithm