Article

Power Control and Link Selection for Wireless Relay Networks with Hybrid Energy Sources

School of Electrical and Electronic Engineering, North China Electric Power University, Beijing 102206, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2019, 9(13), 2744; https://doi.org/10.3390/app9132744
Submission received: 9 June 2019 / Revised: 1 July 2019 / Accepted: 3 July 2019 / Published: 6 July 2019

Abstract

The hybrid energy supply (HES) wireless relay system is a new green network technology in which the source node is powered by the grid and the relay is powered by harvested renewable energy. However, network performance may degrade because renewable energy is intermittent. In this paper, we aim to minimize grid energy consumption and maximize throughput. These two objectives conflict: improving the throughput requires increasing the transmission power of the source node, which leads to higher grid energy consumption. A linear weighted summation method is used to turn the two conflicting objectives into a single objective. Link assignment and a power control strategy are adopted to maximize the total reward of the network. The problem is formulated as a discrete Markov decision model. In addition, a backward induction method based on state elimination is proposed to reduce the computational complexity. Simulation results show that the proposed algorithm can effectively alleviate the performance degradation caused by a lack of renewable energy, and illustrate the trade-off between energy consumption and throughput.

1. Introduction

In order to expand the coverage of wireless networks and improve the communication quality of edge users, relays have been widely used in wireless networks such as long term evolution (LTE) and 5G [1,2]. However, dense deployment of relays can lead to problems such as high energy consumption and greenhouse gas emissions [3,4]. Energy harvesting technology, which collects and utilizes renewable energy such as solar and wind energy, is among the most promising technologies for solving the economic and environmental problems caused by the dense deployment of relays [5,6,7,8,9]. In addition, energy harvesting technology can reduce the dependence of wireless networks on grid energy. Deploying green relays in areas where grid energy is scarce can effectively expand the coverage of wireless communication networks. However, the intermittent and random nature of renewable energy may cause a decline in network performance. Therefore, it is critical to establish a renewable energy allocation and link selection mechanism to ensure the performance of wireless relay networks.
Optimal power control strategies to alleviate the network performance degradation caused by a lack of renewable energy have been proposed in [10,11,12,13]. In [10], both the source and relay nodes are powered by renewable energy, and an off-line power control strategy is designed to minimize the data transmission time under throughput constraints. It was also proved that the power control strategy has a water-filling structure. Differing from [10], the authors of [11] considered a source node powered by the grid and a relay node powered by renewable energy. An off-line power control strategy is proposed to maximize the grid energy efficiency of the source node. In [12], the research was extended to a Gaussian fading channel, and both off-line and on-line power control schemes were proposed. In addition to the traditional relay, [13] considers a relay with a cache function. Off-line and on-line power control strategies are developed to maximize network throughput. All of the above studies have developed off-line or on-line power control strategies under the constraints of the user's quality of service (QoS).
Besides power control schemes, link selection strategies are also critical. In [14], the source node is powered by grid energy while the relay is powered by renewable energy. With causal side information, a link selection strategy is designed to maximize the average throughput over all time slots. In [15], the relay is powered by radio frequency (RF) energy with a finite battery capacity, and only one data packet needs to be transmitted in each slot. A link selection scheme based on the battery level is proposed to minimize the outage probability of the relay. Further, [16] jointly optimizes power and link selection to reduce the outage probability of the relay. In [17], the research was extended to multiple relays with data caching. Under the constraints of battery capacity, on-line and off-line power allocation and link selection mechanisms were developed to minimize the data transmission time.
Most previous studies on hybrid energy supply (HES) wireless relay systems assume that the amount of data transmitted by the source node is fixed in each slot, and then optimize the outage probability or transmission time. However, in practical applications, the source node needs to serve many different users and delivers a varying number of data bits in each slot. Therefore, as many data bits as possible should be transmitted to reduce network congestion, subject to grid energy consumption constraints. Consequently, in this paper, we consider that the source node always has bits available for transmission. Our goal is to maximize network throughput while minimizing grid energy consumption.
The rest of this paper is organized as follows. Section 2 describes the system model. In Section 3, the Markov decision process (MDP) problem is formulated, for which a low-complexity algorithm is proposed. Simulations are shown in Section 4. Finally, Section 5 highlights the conclusion.

2. System Model

2.1. Network Model

We consider a green wireless relay network as shown in Figure 1, which consists of a source node (S), a relay node (R), and a destination node (D). The source node is powered by the electric grid, and its maximum transmit power is denoted as $p_S^{\max}$. The relay is powered purely by renewable energy, which is harvested from nature and stored in a battery, and its maximum transmit power is $p_R^{\max}$. The distances between the source node and relay, relay and destination node, and source node and destination node are $d_{SR}$, $d_{RD}$ and $d_{SD}$, respectively. We consider the performance of the network over $N$ slots of length $T$. The slot index set is denoted as $\mathcal{N} = \{1, 2, \ldots, N\}$.
We assume that the source node always has bits available for transmission. The relay operates in a half-duplex manner: it receives data from the source node in the first half of the time slot and forwards it in the second half [11]. In each slot, there are two links for S to transmit bits: the relay link and the direct link. The link assignment indicator in the $i$-th slot is denoted as $I_j^i \in \{0, 1\}$, with $j \in \{SRD, SD\}$. $I_{SRD}^i = 1$ indicates that the relay link is assigned to deliver the bits, and $I_{SD}^i = 1$ indicates that the direct link is selected. In each slot, only one of these links can be selected, so
$$I_{SD}^i + I_{SRD}^i = 1, \quad \forall i \in \mathcal{N}. \tag{1}$$
In addition, the transmission powers of the source node S and the relay R in slot $i$ are $p_S^i$ and $p_R^i$, respectively. Static power consumption is neglected in this work.
We consider time-varying channels with small-scale fading, and denote by $h_k^i$, with $k \in \{SR, RD, SD\}$, the channel fading factor between two nodes in slot $i$. In addition, all channels share the bandwidth $B$. Therefore, once the direct link is selected in the $i$-th slot, the number of bits transmitted by the source node is calculated as
$$R_{SD}^i = BT \log_2\left(1 + \frac{h_{SD}^i g_0 d_{SD}^{-\alpha}}{\sigma^2} p_{SD}^i\right), \tag{2}$$
according to the Shannon theorem, where $\sigma^2$ is the noise variance at node D, $g_0$ is the channel fading constant, $p_{SD}^i$ is the transmit power of node S on the direct link and $\alpha$ is the path loss exponent. When the relay link is selected, the bits delivered by the source node and by the relay are given as
$$R_{SR}^i = \frac{BT}{2} \log_2\left(1 + \frac{h_{SR}^i g_0 d_{SR}^{-\alpha}}{\sigma^2} p_{SR}^i\right) \tag{3}$$
and
$$R_{RD}^i = \frac{BT}{2} \log_2\left(1 + \frac{h_{RD}^i g_0 d_{RD}^{-\alpha}}{\sigma^2} p_R^i\right), \tag{4}$$
where $p_{SR}^i$ is the transmit power of node S when selecting the relay link.
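The rate expressions in Equations (2) to (4) can be evaluated numerically with a short Python sketch. The function names, the unit conversions and the fading value $h = 1$ are illustrative assumptions and not part of the original model; the logarithm is taken to base 2.

```python
import math

def db_to_linear(x_db: float) -> float:
    """Convert a dB value to linear scale (a dBm value becomes milliwatts)."""
    return 10.0 ** (x_db / 10.0)

def snr_gain(h: float, d: float, g0: float, alpha: float, sigma2: float) -> float:
    """SNR per watt of transmit power: h * g0 * d^(-alpha) / sigma^2."""
    return h * g0 * d ** (-alpha) / sigma2

def direct_bits(p_sd, h_sd, d_sd, B, T, g0, alpha, sigma2):
    """Bits sent over the direct link in one full slot, as in Equation (2)."""
    return B * T * math.log2(1.0 + snr_gain(h_sd, d_sd, g0, alpha, sigma2) * p_sd)

def hop_bits(p, h, d, B, T, g0, alpha, sigma2):
    """Bits sent over one relay hop in half a slot, as in Equations (3) and (4)."""
    return 0.5 * B * T * math.log2(1.0 + snr_gain(h, d, g0, alpha, sigma2) * p)

# Section 4 parameters; the fading value h = 1 is an assumed example.
B, T, alpha = 10e6, 1e-3, 4.0
g0 = db_to_linear(-40.0)
sigma2 = db_to_linear(-97.5) * 1e-3  # dBm -> mW -> W
r_direct = direct_bits(2.0, 1.0, 80.0, B, T, g0, alpha, sigma2)
r_hop = hop_bits(2.0, 1.0, 40.0, B, T, g0, alpha, sigma2)
```

Each hop only gets half of the slot, which is why `hop_bits` carries the factor 0.5 that `direct_bits` does not.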

2.2. Energy Model

In order to simplify the energy harvesting model, the harvesting process is assumed to be completed at the beginning of each time slot [18]. A discrete energy model is then adopted to describe the energy harvesting process: $E_H^i$ energy packets arrive in slot $i$ [19], where $E_H^i$ obeys the Poisson distribution with mean $\lambda$ [20], and $\lambda$ represents the intensity of energy harvesting. In addition, each packet contains the energy $E_e$. In the initial time slot, the energy stored in the battery is the initial energy plus the harvested energy. In any slot $i > 1$, the energy consumed by the relay in the previous slot must also be subtracted. Therefore, the battery level in the $i$-th slot is expressed as:
$$E(i) = \begin{cases} E_0 + E_H^i E_e, & i = 1 \\ E(i-1) - C_R^{i-1} + E_H^i E_e, & i > 1, \end{cases} \tag{5}$$
where $E_0$ is the initial energy, and $C_R^{i-1}$ is the energy consumed by the relay in the previous slot. The relay draws energy only when the relay link is selected, in which case it transmits at power $p_R^{i-1}$ for half a slot:
$$C_R^{i-1} = I_{SRD}^{i-1} \frac{T}{2} p_R^{i-1}. \tag{6}$$
Since the battery has a limited size, the following energy constraint should be satisfied:
$$E(i) \le E_{\max}, \tag{7}$$
where $E_{\max}$ is the maximum capacity of the battery. The energy consumed by the relay should be no more than the energy of the battery, that is, $C_R^i \le E(i)$.
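A minimal Python sketch of this battery model follows. It assumes that the relay draws $\frac{T}{2} p_R$ in a slot where the relay link is used and nothing otherwise, and that harvested energy above the capacity is discarded. The function names and the Poisson sampler are illustrative choices, not taken from the paper.

```python
import math
import random

def poisson_sample(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def relay_energy_used(relay_link: bool, p_r: float, T: float) -> float:
    """Energy drawn in one slot (assumption: T/2 * p_r on the relay link, else 0)."""
    return 0.5 * T * p_r if relay_link else 0.0

def next_battery(e_prev: float, used: float, packets: int,
                 e_packet: float, e_max: float) -> float:
    """Subtract last slot's use, add the new harvest, and cap at the capacity."""
    if used > e_prev:
        raise ValueError("relay cannot spend more than the stored energy")
    return min(e_prev - used + packets * e_packet, e_max)

# One illustrative step with Section 4 values (energies in joules).
rng = random.Random(0)
harvest = poisson_sample(3.0, rng)
level = next_battery(1.0e-3, relay_energy_used(True, 0.5, 1e-3), harvest, 0.01e-3, 1.6e-3)
```

The explicit check in `next_battery` mirrors the requirement that the relay never consumes more than the battery holds.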

2.3. Optimizing Objective

Most previous studies on two-hop green wireless relay networks concentrated on minimizing the outage probability or transmission time with a fixed number of bits in each slot [10,21,22]. However, when the network is busy and needs to provide services for multiple users, there will be a continuous stream of data bits to be transmitted. In this case, throughput reflects the carrying capacity of the network, which is the quantity of interest here. In addition, a source node powered by grid energy is found in some practical wireless networks. For example, macro base stations powered by grid energy can maintain basic coverage in heterogeneous wireless networks. Taking the above aspects into consideration, we set grid energy consumption and throughput as optimization objectives. However, according to Equations (2) and (3), improving the throughput requires increasing the transmission power of the source node, which leads to higher grid energy consumption. Therefore, increasing throughput and reducing grid energy consumption are conflicting objectives, which can be formulated as:
$$obj = \begin{cases} \max \sum_{i=1}^{N} \left( I_{SD}^i R_{SD}^i + I_{SRD}^i R_{RD}^i \right) \\ \min \sum_{i=1}^{N} \left( I_{SD}^i T p_{SD}^i + I_{SRD}^i \frac{T}{2} p_{SR}^i \right). \end{cases} \tag{8}$$
There are many ways to solve multi-objective problems, and linear weighted summation is an effective one [23]. We use the weighted summation method to transform the conflicting objectives into a single objective. Thus, the combined objective can be denoted as:
$$TR = \sum_{i=1}^{N} \left[ \omega_t \left( I_{SD}^i R_{SD}^i + I_{SRD}^i R_{RD}^i \right) - \omega_g \left( I_{SD}^i T p_{SD}^i + I_{SRD}^i \frac{T}{2} p_{SR}^i \right) \right], \tag{9}$$
where $\omega_t$ and $\omega_g$ are the weight coefficients of throughput and grid energy consumption, respectively. $\omega_t ( I_{SD}^i R_{SD}^i + I_{SRD}^i R_{RD}^i )$ and $\omega_g ( I_{SD}^i T p_{SD}^i + I_{SRD}^i \frac{T}{2} p_{SR}^i )$ are the weighted throughput reward and the weighted grid energy cost in the $i$-th slot.

3. Optimal Control for Expected Total Rewards

In this section, we assume that $h_{SR}^i$, $h_{SD}^i$ and $h_{RD}^i$ are causally known. Consequently, we aim to adapt the transmission power and link selection to maximize the expected total rewards. Thus, the problem can be formulated as:
$$P_1: \max_{I_j^i,\, p_{SR}^i,\, p_{SD}^i,\, p_R^i} \mathbb{E}\left\{ \sum_{i=1}^{N} \left[ \omega_t \left( I_{SD}^i R_{SD}^i + I_{SRD}^i R_{RD}^i \right) - \omega_g \left( I_{SD}^i T p_{SD}^i + I_{SRD}^i \frac{T}{2} p_{SR}^i \right) \right] \right\} \tag{10}$$
$$\text{s.t. Equations (1) and (7)}, \tag{11}$$
$$C_R^i \le E(i), \tag{12}$$
$$R_{SR}^i = R_{RD}^i, \tag{13}$$
$$p_j^i \le p_j^{\max}, \quad \forall i,\ j \in \{S, R\}, \tag{14}$$
$$I_j^i \in \{0, 1\}, \quad \forall i \in \mathcal{N},\ j \in \{SD, SRD\}. \tag{15}$$
In $P_1$, the objective is the expected total reward over $N$ time slots. Equation (12) is the energy constraint that the energy consumed by the relay should not exceed that of the battery. Equation (13) is the throughput constraint that the bits delivered in the two stages of the relay link should be equal. Equations (14) and (15) are the power constraint and the link selection constraint.

3.1. Problem Simplification

From Equation (10), we can see that the optimization variables of $P_1$ include the 0–1 variable $I_j^i$ and the continuous variables $p_{SR}^i$, $p_{SD}^i$ and $p_R^i$. It is very difficult to optimize so many different types of variables at the same time. Therefore, we simplify $P_1$ to reduce the number of optimization variables, and the problem can be expressed as:
$$P_2: \max_{p_R^i} \mathbb{E}\left\{ \sum_{i=1}^{N} \left[ \beta^i D^i + (1 - \beta^i) H^i \right] \right\} \tag{16}$$
$$\text{s.t.} \quad C_R^i \le E(i), \tag{17}$$
$$p_R^i \le p_R^{\max}, \quad \forall i. \tag{18}$$
Here $\beta^i$ is a 0–1 variable, which is given by:
$$\beta^i = \begin{cases} 1, & p_R^i \neq 0 \\ 0, & p_R^i = 0, \end{cases} \quad \forall i. \tag{19}$$
That is, when the transmission power of the relay is non-zero, $\beta^i = 1$ and the relay link is chosen to transmit data. When the power of the relay is zero, $\beta^i = 0$ and the direct link is selected to deliver bits. In addition, $D^i$ is the total network reward when the relay link is chosen, which is given by:
$$D^i = \omega_t R_{RD}^i - \omega_g G_{SR}^i, \tag{20}$$
where $R_{RD}^i$ is the throughput, and $G_{SR}^i$ is the grid energy consumed by the source node. We assume that the numbers of data bits transmitted in the two stages of the relay link are equal, so
$$R_{SR}^i = R_{RD}^i = \frac{BT}{2} \log_2\left(1 + \frac{h_{SR}^i g_0 d_{SR}^{-\alpha}}{\sigma^2} p_{SR}^i\right) = \frac{BT}{2} \log_2\left(1 + \frac{h_{RD}^i g_0 d_{RD}^{-\alpha}}{\sigma^2} p_R^i\right). \tag{21}$$
From Equation (21), we obtain
$$h_{SR}^i d_{SR}^{-\alpha} p_{SR}^i = h_{RD}^i d_{RD}^{-\alpha} p_R^i. \tag{22}$$
According to Equations (20) to (22), $D^i$ can be further given as:
$$D^i = \omega_t \frac{BT}{2} \log_2\left(1 + \frac{h_{RD}^i g_0 d_{RD}^{-\alpha}}{\sigma^2} p_R^i\right) - \omega_g \frac{T}{2} \frac{h_{RD}^i d_{RD}^{-\alpha} p_R^i}{h_{SR}^i d_{SR}^{-\alpha}}. \tag{23}$$
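As a hedged illustration of Equations (22) and (23), the Python sketch below first derives the source power that equalizes the two hop rates and then evaluates the relay-link reward. The function names are hypothetical, the logarithm is taken to base 2, and the sketch does not check the source power against $p_S^{\max}$.

```python
import math

def source_power_for_relay(p_r, h_sr, d_sr, h_rd, d_rd, alpha):
    """Source power that equalizes the two hop rates, from Equation (22)."""
    return h_rd * d_rd ** (-alpha) * p_r / (h_sr * d_sr ** (-alpha))

def relay_reward(p_r, h_sr, d_sr, h_rd, d_rd, w_t, w_g, B, T, g0, alpha, sigma2):
    """Relay-link reward of Equation (23) for a relay power p_r > 0."""
    snr = h_rd * g0 * d_rd ** (-alpha) / sigma2 * p_r
    bits = 0.5 * B * T * math.log2(1.0 + snr)
    # The source transmits for half a slot at the rate-matching power.
    grid_energy = 0.5 * T * source_power_for_relay(p_r, h_sr, d_sr, h_rd, d_rd, alpha)
    return w_t * bits - w_g * grid_energy
```

A stronger source-to-relay channel lowers the matching source power, so the grid energy term shrinks and the reward grows.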
$H^i$ is the reward when the direct link is selected, and the problem of solving $H^i$ can be expressed as:
$$P_H: \max_{p_{SD}^i} H^i = \max_{p_{SD}^i} \left[ \omega_t R_{SD}^i - \omega_g G_{SD}^i \right] = \max_{p_{SD}^i} \left[ \omega_t TB \log_2\left(1 + \frac{h_{SD}^i g_0 d_{SD}^{-\alpha}}{\sigma^2} p_{SD}^i\right) - \omega_g T p_{SD}^i \right] \tag{24}$$
$$\text{s.t.} \quad p_{SD}^i \le p_S^{\max}, \quad \forall i. \tag{25}$$
Proposition 1.
The reward of the direct link $H^i$ is concave in the transmission power $p_{SD}^i$.
Proof: 
The second derivative of the objective in Equation (24) with respect to $p_{SD}^i$ is
$$f''(p_{SD}^i) = -\frac{\omega_t TB \left( \frac{h_{SD}^i g_0 d_{SD}^{-\alpha}}{\sigma^2} \right)^2}{\left( 1 + \frac{h_{SD}^i g_0 d_{SD}^{-\alpha}}{\sigma^2} p_{SD}^i \right)^2 \ln 2} < 0, \tag{26}$$
which means that $f(p_{SD}^i)$ is a concave function. □
Because $H^i$ is concave, setting its first derivative to zero shows that $H^i$ attains its maximum at $p_{SD}^{i*} = \frac{\omega_t B}{\omega_g \ln 2} - \frac{\sigma^2}{h_{SD}^i g_0 d_{SD}^{-\alpha}}$. Therefore, the optimal $p_{SD}^i$ and $H^i$ are known once $h_{SD}^i$ is given. However, $p_{SD}^i$ has practical significance only on $[0, p_S^{\max}]$ in this work. Consequently, when $p_{SD}^{i*} < 0$, $H^i$ decreases monotonically on $[0, p_S^{\max}]$ and its maximum value is $H^i(0)$. When $p_{SD}^{i*} > p_S^{\max}$, $H^i$ increases monotonically on $[0, p_S^{\max}]$ and its maximum value is $H^i(p_S^{\max})$.
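The clipping argument above can be sketched in Python. The stationary point used here comes from setting the derivative of the objective in Equation (24) to zero with a base-2 logarithm, which gives $\frac{\omega_t B}{\omega_g \ln 2} - \frac{\sigma^2}{h_{SD}^i g_0 d_{SD}^{-\alpha}}$. The function names and this closed form are assumptions of the sketch and should be checked against the intended log base.

```python
import math

def direct_reward(p, h_sd, d_sd, w_t, w_g, B, T, g0, alpha, sigma2):
    """Direct-link reward for transmit power p (objective of Equation (24))."""
    c = h_sd * g0 * d_sd ** (-alpha) / sigma2
    return w_t * T * B * math.log2(1.0 + c * p) - w_g * T * p

def optimal_direct_power(h_sd, d_sd, w_t, w_g, B, g0, alpha, sigma2, p_s_max):
    """Maximizer of the direct-link reward on [0, p_s_max]."""
    c = h_sd * g0 * d_sd ** (-alpha) / sigma2
    # Stationary point of the unconstrained objective (log base 2 assumed).
    stationary = w_t * B / (w_g * math.log(2.0)) - 1.0 / c
    # The objective has a negative second derivative, so clipping is optimal.
    return min(max(stationary, 0.0), p_s_max)
```

Because the objective has a single peak, clipping the stationary point to the feasible interval reproduces the two boundary cases without a separate search.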
It can be seen from Equations (16) to (26) that the values of $\beta^i$ and $D^i$ are determined by $p_R^i$, and that the maximum value of $H^i$ in each time slot is fixed once $h_{SD}^i$ is known. Therefore, the only optimization variable of problem $P_2$ is $p_R^i$.

3.2. MDP Model for Expected Total Rewards

Our goal is to maximize the total reward over $N$ slots through a relay power control scheme. However, because the battery capacity is limited, the relay power selected in each slot affects the battery level at the start of the next slot. Power decisions in different time slots therefore influence each other. The MDP is a useful model for such decision problems, and backward induction is an effective algorithm for solving them [24].
Therefore, we formulate $P_2$ as a Markov decision process (MDP) problem, which can also be expressed as:
$$P_2: \max_{\pi \in \Pi} \mathbb{E}^{\pi} \sum_{i=1}^{N} R(i), \tag{27}$$
where $\pi$ is a feasible relay power policy and $\Pi$ denotes the set of all feasible policies. $R(i)$ is the reward of slot $i$, which is given by $\left[ \beta^i D^i + (1 - \beta^i) H^i \right]$.

3.2.1. MDP Basics

A sequential decision-making method is the selection of one of several action strategies in each time slot during the operation of the system [25]. In the sequential decision process, if the transfer of the system state obeys the known probability law and is independent of the previous history, then this sequential problem is called an MDP problem [26]. An MDP model consists of a reward function, system states, actions, state transition probability and objective, each of which will be described in detail later.

3.2.2. Reward Function

In an MDP model, the reward function is defined as $r(j, a_i)$, which is the reward the system obtains by taking action $a_i$ in state $j$ [27]. Denote $o_i \in O$, $\forall i$, as the decision rule for selecting the relay power in slot $i$. The rules over $N$ slots then form a policy $\pi = (o_1, o_2, o_3, \ldots, o_N)$, and the set of all possible policies is denoted as $\Pi$. Given the initial state $k$ and a policy $\pi \in \Pi$, the expected total reward can also be written as:
$$V_N(\pi, s_1) = \sum_{i=1}^{N} \sum_{a_i \in A_i,\, j \in S} P^{\pi}\{ s_i = j, \Delta_i = a_i \mid s_1 = k \}\, r(j, a_i), \tag{28}$$
where $s_i$ and $\Delta_i$ are the state of the relay system and the selected action in slot $i$, respectively. $P^{\pi}\{ s_i = j, \Delta_i = a_i \mid s_1 = k \}$ is the conditional probability that, under policy $\pi$ and starting from state $k$, the system is in state $j$ and selects action $a_i$ in slot $i$. Our aim is to find the optimal action selection scheme $\pi^* = (o_1^*, o_2^*, \ldots, o_N^*)$, which maximizes $V_N(\pi, s_1)$.

3.2.3. Discretization of System States and Actions

The sets of system states and actions must be finite in an MDP model. However, in the wireless relay network the system states include the channel states and the battery state, and the relay power actions are continuous values. Therefore, it is necessary to discretize the system states and relay power actions. The relay system state consists of the channel fading values and the battery level, and can be given as $s_i = \langle h_{SR}^i, h_{SD}^i, h_{RD}^i, \varepsilon(i) \rangle$. We discretize the channel states following the method in [28]. Denote $H = \{ H_1, H_2, \ldots, H_K \}$ as the set of channel fading values, which forms an arithmetic sequence. The probability that the channel fading is $H_k$ with $k \in \{ 1, 2, \ldots, K \}$ can be calculated as:
$$p\{ h_j^i = H_k \} = \frac{1}{K}, \quad \forall i,\ j \in \{ SR, SD, RD \},\ k \in \{ 1, 2, \ldots, K \}. \tag{29}$$
We divide the battery into $M + 1$ energy levels, and the battery state set is taken as $\varepsilon(i) \in \boldsymbol{\varepsilon} = [0, 1, \ldots, m, \ldots, M]$. The real-time energy level of the battery can be calculated by
$$\varepsilon(i) = m = \left\lfloor \frac{E(i) M}{E_{\max}} \right\rfloor. \tag{30}$$
Denote $A_i = \left[ 0, \frac{p_R^{\max}}{L}, \ldots, p_R^{\max} \right]$ as the action set, which is also an arithmetic sequence. In practice, the relay transmit power is constrained by the battery level. Thus, the action set is given as $A_i = \left[ 0, \frac{p_R^{\max}}{L}, \ldots, \min( p_R^{\max}, p_x^i ) \right]$, where $p_x^i$ is calculated by:
$$p_x^i = \left\lfloor \frac{2 \varepsilon(i) E_{\max} L}{M T p_R^{\max}} \right\rfloor \frac{p_R^{\max}}{L}. \tag{31}$$
$\lfloor x \rfloor$ is the function that rounds the variable $x$ down.
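The discretization can be sketched in a few lines of Python. The function names are illustrative, and the sketch assumes that a battery at level $m$ holds $m E_{\max} / M$ of usable energy and that the relay transmits for half a slot.

```python
import math

def battery_level(energy: float, e_max: float, M: int) -> int:
    """Quantize stored energy to a level in {0, ..., M}, rounding down."""
    return min(M, math.floor(energy * M / e_max))

def action_set(level: int, e_max: float, M: int, L: int, T: float, p_r_max: float):
    """Relay powers in steps of p_r_max / L that the battery can sustain for T/2."""
    step = p_r_max / L
    affordable_steps = math.floor(2.0 * level * e_max * L / (M * T * p_r_max))
    n_steps = min(L, affordable_steps)
    return [k * step for k in range(n_steps + 1)]
```

For example, with `e_max=1.0`, `M=8`, `L=8`, `T=2.0` and `p_r_max=1.0`, a battery at level 3 allows the powers 0, 0.125, 0.25 and 0.375. Rounding down means the represented energy never exceeds the true stored energy.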

3.2.4. State Transition Probability

After action $a_i$ is selected, the system state moves from $s_i$ to $s_{i+1}$, which can be expressed as $s_i = \langle h_{SR}^i, h_{SD}^i, h_{RD}^i, \varepsilon(i) \rangle \rightarrow s_{i+1} = \langle h_{SR}^{i+1}, h_{SD}^{i+1}, h_{RD}^{i+1}, \varepsilon(i+1) \rangle$. Since the channel fading values are equiprobable, the state transition probability is:
$$p\{ s_{i+1} \mid s_i, a_i \} = \frac{1}{K^3}\, p\{ \varepsilon_{i+1} \mid \varepsilon_i, a_i \}. \tag{32}$$
We assume that $\varepsilon(i)$ and $\varepsilon(i+1)$ are at levels $M_1$ and $M_2$ of the battery, which should satisfy $\frac{M_2 E_{\max}}{M} \le \frac{M_1 E_{\max}}{M} - \frac{a_i T}{2} + E_H^i E_e < \frac{(M_2 + 1) E_{\max}}{M}$. As mentioned in Section 2.2, the energy harvesting process obeys the Poisson distribution with mean $\lambda$. Therefore, Equation (32) can be further given as:
$$p\{ s_{i+1} \mid s_i, a_i \} = \frac{1}{K^3} \sum_{n = n_1}^{n_2} \frac{\lambda^n}{n!} e^{-\lambda}, \tag{33}$$
where $n_1$ and $n_2$ are given as $n_1 = \left\lceil \frac{ \frac{M_2 E_{\max}}{M} + \frac{a_i T}{2} - \frac{M_1 E_{\max}}{M} }{E_e} \right\rceil$ and $n_2 = \left\lceil \frac{ \frac{(M_2 + 1) E_{\max}}{M} + \frac{a_i T}{2} - \frac{M_1 E_{\max}}{M} }{E_e} \right\rceil - 1$, respectively. $\lceil x \rceil$ is the function that rounds the variable $x$ up.
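A small Python sketch of this transition probability is given below. It sums the Poisson probabilities of the packet counts that move the battery from level $M_1$ to level $M_2$ when the relay spends $a_i T / 2$. The function names are hypothetical, and the sketch does not model overflow at the top level (arrivals that would exceed the capacity), which the text leaves unspecified.

```python
import math

def poisson_pmf(n: int, lam: float) -> float:
    """Probability of exactly n arrivals under a Poisson(lam) distribution."""
    return lam ** n * math.exp(-lam) / math.factorial(n)

def battery_transition_prob(m1, m2, a, M, e_max, e_packet, T, lam):
    """P(next level = m2 | current level m1, relay power a)."""
    unit = e_max / M
    after_use = m1 * unit - 0.5 * T * a  # energy left after transmitting
    n_low = max(0, math.ceil((m2 * unit - after_use) / e_packet))
    n_high = math.ceil(((m2 + 1) * unit - after_use) / e_packet) - 1
    if n_high < n_low:
        return 0.0
    return sum(poisson_pmf(n, lam) for n in range(n_low, n_high + 1))

def state_transition_prob(m1, m2, a, M, e_max, e_packet, T, lam, K):
    """Full transition probability: three independent, uniform channel states."""
    return battery_transition_prob(m1, m2, a, M, e_max, e_packet, T, lam) / K ** 3
```

The packet-count ranges for different target levels are disjoint and cover all non-negative integers, so the battery transition probabilities from a given level and action sum to one when enough target levels are included.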

3.3. The Backward Induction Algorithm for MDP Problem

The backward induction algorithm is an effective way to obtain the optimal strategy and value function of a finite-stage Markov decision problem [26]. Based on the backward induction algorithm, the value function $V_i(k)$ of state $k$ in slot $i$ is defined as:
$$V_i(k) = \max_{a_i \in A_i} \left[ r(k, a_i) + \sum_{j \in S} p(j \mid k, a_i) V_{i+1}(j) \right] = r(k, f_i(k)) + \sum_{j \in S} p(j \mid k, f_i(k)) V_{i+1}(j), \quad k \in S,\ i = N, N-1, N-2, \ldots, 1, \tag{34}$$
where $f_i(k)$ is the maximizing action in state $k$ and $V_{N+1}(k) = 0$, $\forall k \in S$ [26]. According to Equation (34), the optimal value function of the expected total rewards can be calculated as $V_1 = ( V_1(1), V_1(2), \ldots, V_1(q) )$. Meanwhile, the decision sequence $\pi^* = ( o_1^*, o_2^*, \ldots, o_N^* )$ obtained in this way is the optimal strategy.
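The recursion can be illustrated with a generic finite-horizon backward induction in Python. This is a plain sketch without any state elimination; the function signature and the two-level battery toy problem are assumptions made for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, List

State = Hashable
Action = Hashable

def backward_induction(
    states: List[State],
    horizon: int,
    actions: Callable[[int, State], Iterable[Action]],
    reward: Callable[[int, State, Action], float],
    transition: Callable[[int, State, Action], Dict[State, float]],
):
    """Return (value, policy), each indexed as [slot][state] for slots 1..horizon."""
    value = {horizon + 1: {s: 0.0 for s in states}}  # terminal values are zero
    policy = {}
    for i in range(horizon, 0, -1):
        value[i], policy[i] = {}, {}
        for s in states:
            best_action, best_q = None, float("-inf")
            for a in actions(i, s):
                q = reward(i, s, a) + sum(
                    prob * value[i + 1][nxt]
                    for nxt, prob in transition(i, s, a).items()
                )
                if q > best_q:
                    best_action, best_q = a, q
            value[i][s], policy[i][s] = best_q, best_action
    return value, policy

# Toy problem: one unit of stored energy, worth more in slot 2 than in slot 1.
SLOT_GAIN = {1: 1.0, 2: 3.0}

def toy_actions(i, level):
    return [0, 1] if level == 1 else [0]

def toy_reward(i, level, a):
    return SLOT_GAIN[i] if a == 1 else 0.0

def toy_transition(i, level, a):
    return {0: 1.0} if a == 1 else {level: 1.0}

value, policy = backward_induction([0, 1], 2, toy_actions, toy_reward, toy_transition)
```

In the toy problem the resulting policy saves the energy in slot 1 and spends it in slot 2, whereas a rule that only looks at the current slot would spend it immediately.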
With the backward induction algorithm, the number of states to traverse in each slot is $M \times K^3$. The state space may therefore be very large when $M$ or $K$ is large, and the algorithm may encounter the curse of dimensionality [29]. An effective method to reduce the computational complexity of an MDP model is proposed in [28]. Following [28], we use the properties of the wireless relay network in our model to eliminate some states that do not need to be searched.
Proposition 2.
When $h_{SR}^i$, $h_{RD}^i$ and $\varepsilon(i)$ are fixed, and the optimal action is $a_i = 0$ at state $s_i = \langle h_{SR}^i, h_{SD}^i, h_{RD}^i, \varepsilon(i) \rangle$, the optimal action is also $a_i = 0$ for any state $s_i^+$ with $h_{SD}^{i+} > h_{SD}^i$.
Proof: 
If the optimal action for the state $s_i$ is $a_i = 0$, then according to Equation (34), we know that:
$$r(s_i, 0) + \sum_{j \in S} p(j \mid s_i, 0) V_{i+1}(j) > \max_{a_i \in A_i,\, a_i \neq 0} \left[ r(s_i, a_i) + \sum_{j \in S} p(j \mid s_i, a_i) V_{i+1}(j) \right]. \tag{35}$$
From Equation (24), we know that
$$r(s_i, 0) = H^i = \omega_t TB \log_2\left(1 + \frac{h_{SD}^i g_0 d_{SD}^{-\alpha}}{\sigma^2} p_{SD}^i\right) - \omega_g T p_{SD}^i, \tag{36}$$
which becomes larger as $h_{SD}^i$ grows. Therefore, for any state $s_i^+$ with $h_{SD}^{i+} > h_{SD}^i$, $r(s_i^+, 0) > r(s_i, 0)$. Since $h_{SR}^i$, $h_{RD}^i$ and $\varepsilon(i)$ are fixed, we get
$$\sum_{j \in S} p(j \mid s_i^+, 0) V_{i+1}(j) = \sum_{j \in S} p(j \mid s_i, 0) V_{i+1}(j), \tag{37}$$
and
$$\max_{a_i \in A_i,\, a_i \neq 0} \left[ r(s_i^+, a_i) + \sum_{j \in S} p(j \mid s_i^+, a_i) V_{i+1}(j) \right] = \max_{a_i \in A_i,\, a_i \neq 0} \left[ r(s_i, a_i) + \sum_{j \in S} p(j \mid s_i, a_i) V_{i+1}(j) \right]. \tag{38}$$
Finally,
$$r(s_i^+, 0) + \sum_{j \in S} p(j \mid s_i^+, 0) V_{i+1}(j) > \max_{a_i \in A_i,\, a_i \neq 0} \left[ r(s_i^+, a_i) + \sum_{j \in S} p(j \mid s_i^+, a_i) V_{i+1}(j) \right], \tag{39}$$
which proves that the optimal action is $a_i = 0$ in the state $s_i^+$. □
Proposition 3.
When $h_{SD}^i$, $h_{RD}^i$ and $\varepsilon(i)$ are fixed, and the optimal action is $a_i = 0$ at state $s_i^+ = \langle h_{SR}^{i+}, h_{SD}^i, h_{RD}^i, \varepsilon(i) \rangle$, the optimal action is also $a_i = 0$ for any state $s_i$ with $h_{SR}^i < h_{SR}^{i+}$.
Proof: 
If the optimal action for state $s_i^+$ is $a_i = 0$, then according to Equation (34), we get:
$$r(s_i^+, 0) + \sum_{j \in S} p(j \mid s_i^+, 0) V_{i+1}(j) > \max_{a_i \in A_i,\, a_i \neq 0} \left[ r(s_i^+, a_i) + \sum_{j \in S} p(j \mid s_i^+, a_i) V_{i+1}(j) \right]. \tag{40}$$
When $p_R^i \neq 0$, $r(s_i, p_R^i)$ is given as
$$r(s_i, p_R^i) = D^i = \omega_t \frac{BT}{2} \log_2\left(1 + \frac{h_{RD}^i g_0 d_{RD}^{-\alpha}}{\sigma^2} p_R^i\right) - \omega_g \frac{T}{2} \frac{h_{RD}^i d_{RD}^{-\alpha} p_R^i}{h_{SR}^i d_{SR}^{-\alpha}}, \tag{41}$$
which becomes larger as $h_{SR}^i$ grows, because the grid energy term decreases. Thus, for the state $s_i$ with $h_{SR}^i < h_{SR}^{i+}$, $r(s_i, a_i) < r(s_i^+, a_i)$, $\forall a_i \in A_i$ and $a_i \neq 0$. Since $h_{SD}^i$, $h_{RD}^i$ and $\varepsilon(i)$ are fixed, we have:
$$\max_{a_i \in A_i,\, a_i \neq 0} \left[ r(s_i, a_i) + \sum_{j \in S} p(j \mid s_i, a_i) V_{i+1}(j) \right] < \max_{a_i \in A_i,\, a_i \neq 0} \left[ r(s_i^+, a_i) + \sum_{j \in S} p(j \mid s_i^+, a_i) V_{i+1}(j) \right], \tag{42}$$
and
$$r(s_i, 0) + \sum_{j \in S} p(j \mid s_i, 0) V_{i+1}(j) = r(s_i^+, 0) + \sum_{j \in S} p(j \mid s_i^+, 0) V_{i+1}(j). \tag{43}$$
Finally, combining Equations (40), (42) and (43),
$$r(s_i, 0) + \sum_{j \in S} p(j \mid s_i, 0) V_{i+1}(j) > \max_{a_i \in A_i,\, a_i \neq 0} \left[ r(s_i, a_i) + \sum_{j \in S} p(j \mid s_i, a_i) V_{i+1}(j) \right], \tag{44}$$
which indicates that the optimal action is $a_i = 0$ in the state $s_i$. □
Algorithm 1. Backward Induction Algorithm Based on States Elimination (BIABoSE)
Input: $p_R^{\max}$, $p_S^{\max}$, $d_{SD}$, $d_{RD}$, $d_{SR}$, $T$, $B$, $N$, $K$, $L$, $\lambda$, $E_e$, $\omega_t$, $\omega_g$, $\boldsymbol{\varepsilon}$
Output: $\pi^*$
Initialize $\pi = \mathrm{zeros}(N, K^3 \times M)$, $V_{N+1} = 0$
While $N \ge 1$
  $p_x^N = \left\lfloor \frac{2 \varepsilon(N) L}{T p_R^{\max}} \right\rfloor \frac{p_R^{\max}}{L}$, $A_N = \left[ 0, \frac{p_R^{\max}}{L}, \ldots, \min( p_R^{\max}, p_x^N ) \right]$;
  For $m = 1$ to $M$, $k_{SR} = 1$ to $K$
    $k_{RD} = K$, $k_{SD} = 1$;
    While $k_{SD} \neq K + 1$ and $k_{RD} \neq 0$
      $s = \langle H_{k_{SR}}, H_{k_{SD}}, H_{k_{RD}}, \varepsilon(m) \rangle$;
      For $j = 1$ : length($A_N$)
        Calculate $\pi(N, s) = \arg\max_{A_N(j)} V_N(s) = \arg\max_{A_N(j)} \left( r(s, A_N(j)) + \sum_{l \in S} p(l \mid s, A_N(j)) V_{N+1}(l) \right)$;
      End For
      If $\pi(N, s_{k_{SR}, k_{SD}, k_{RD}, m}) = 0$
        $\pi(N, s_{k_{SR}, k_{SD}^+, k_{RD}, m}) = 0$, $\forall k_{SD}^+ > k_{SD}$; $\pi(N, s_{k_{SR}, k_{SD}, k_{RD}^-, m}) = 0$, $\forall k_{RD}^- < k_{RD}$;
        $k_{RD} = k_{RD} - 1$, $k_{SD} = k_{SD} + 1$;
      Else
        If $k_{SD} = K$
          $k_{RD} = k_{RD} - 1$, $k_{SD} = 1$;
        Else
          $k_{SD} = k_{SD} + 1$;
        End If
      End If
    End While
  End For
  $N = N - 1$;
End While
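The pruning idea behind Proposition 2 can be sketched in Python as follows. For fixed source-to-relay fading, relay-to-destination fading and battery level, the direct-link fading levels are scanned in increasing order, and once the optimal relay power is zero, all higher levels are filled with zero without further evaluation. This is a simplified illustration rather than a full implementation of Algorithm 1, and `best_action` is a hypothetical callback that performs the expensive maximization of Equation (34).

```python
from typing import Callable, List, Tuple

def prune_by_direct_gain(
    num_sd_levels: int,
    best_action: Callable[[int], float],
) -> Tuple[List[float], int]:
    """Return the optimal relay power per direct-link level and the evaluation count."""
    actions: List[float] = []
    evaluations = 0
    for k_sd in range(num_sd_levels):
        action = best_action(k_sd)  # expensive maximization over the action set
        evaluations += 1
        actions.append(action)
        if action == 0:
            # A better direct channel cannot make the relay link more attractive,
            # so every remaining level also gets relay power zero.
            actions.extend([0] * (num_sd_levels - k_sd - 1))
            break
    return actions, evaluations

# Hypothetical example: the relay link stops being optimal from level 3 onward.
example_actions, example_evals = prune_by_direct_gain(10, lambda k: 0.25 if k < 3 else 0)
```

In this example only 4 of the 10 levels are evaluated; the saving grows with the number of levels that lie above the first zero-power level.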

4. Numerical Simulations

In this section, we run numerical simulations to analyze the total reward, grid energy consumption and throughput in two-hop wireless relay networks. In the simulations, we set $B = 10$ MHz, $T = 1$ ms, $p_S^{\max} = 2$ W, $p_R^{\max} = 0.5$ W, $\sigma^2 = -97.5$ dBm, $g_0 = -40$ dB and $\alpha = 4$ [28]. In addition, $E_e = 0.01$ mJ, $E_{\max} = 1.6$ mJ, $K = 10$, $L = 20$, $d_{SD} = 80$ m and $\omega_g = 1$. The detailed numerical results are as follows.

4.1. Baseline Schemes

Joint Power Control and Link Selection Algorithm (JPLA): The JPLA only considers the current system state and calculates the maximum rewards of the relay link and the direct link separately. The access link is then selected by comparing the two rewards.
Power Control Algorithm (PCA): The PCA maximizes the reward in a single slot by adjusting the power of the relay, and does not take link selection into account [11].
When the relay has sufficient energy, the system is in the ideal state. In order to compare the ideal results with our results in different situations, we also consider JPLA-F and BIABoSE-F, which are JPLA and BIABoSE with sufficient renewable energy.

4.2. Parameter Analysis

Figure 2 shows the total rewards for different numbers of battery levels at different time slots. In any slot, the total rewards increase as the number of battery levels rises. The energy between two adjacent levels is represented by the lower level, and the interval between two adjacent levels becomes smaller as the number of levels becomes larger. The error between the true value and the represented value is then smaller, which gives a more accurate result. When the number of battery levels reaches 80 and 160, the rewards are close to each other and maximal. Consequently, to reduce the computational complexity, $M = 80$ is used in the following simulations.
We assume that $d_{SD} = d_{SR} + d_{RD} = 80$ m. Figure 3 and Figure 4 show how the total rewards vary with $d_{RD}$. As $d_{RD}$ increases, the rewards of all algorithms first increase and then decrease. When $d_{RD}$ is small, $d_{SR} = 80 - d_{RD}$ is large, and the path loss between the source node and relay is high. In this case, the source node delivers few bits to the relay at a high grid energy cost, which leads to low total rewards. As $d_{RD}$ increases, the path loss between the source node and relay decreases, and the total rewards rise. Once $d_{RD}$ exceeds a certain threshold, the path loss between the relay and destination is high, and the number of bits that can be transmitted by the relay is lower than that transmitted by the source node. In this case, the total reward gradually decreases as the throughput of the relay tapers off.
In addition, the JPLA-F and BIABoSE-F achieve their maximum value near $d_{RD} = 40$ m in both figures. Meanwhile, the PCA, JPLA and BIABoSE reach their maximum value at different $d_{RD}$ in the two figures. Unlike the JPLA-F and BIABoSE-F, the other three algorithms are affected by the energy harvesting intensity. The energy that the relay needs for data transmission grows as $d_{RD}$ increases. Therefore, the more sufficient the energy is, the closer the total rewards are to the optimal result. The total rewards reach the maximum value at $d_{RD} = 40$ m. Thus, we choose $d_{RD} = 40$ m for subsequent simulations to better observe the improvement of system performance when energy is scarce.

4.3. Total Reward Maximization

Figure 5 shows how the total rewards change with the time slots. Compared with the PCA, the JPLA adds a link selection mechanism. Therefore, the JPLA can transmit data through the direct link when the battery level is very low, which increases the total rewards. The BIABoSE takes the future system states into account, which yields a more efficient green energy allocation over $N$ slots than the JPLA. However, all the algorithms can only alleviate the system performance degradation caused by insufficient energy and cannot replace the green energy supply. Therefore, the JPLA-F and the BIABoSE-F always have the highest total rewards.
Figure 6 shows how the total rewards vary as the energy harvesting intensity increases. When the energy harvesting intensity is low, the system is in a green-energy-deficient state. In this case, the relay can deliver more bits as the intensity increases, which leads to a higher reward. However, the rewards become constant once the energy intensity reaches a certain threshold, because the battery capacity is limited. It should be noted that, because of the discretization of states, the rewards of the BIABoSE are lower than those of the other algorithms when green energy is sufficient. However, the BIABoSE achieves better performance in our main application scenario, in which green energy is scarce.

4.4. Grid Energy Consumption and Throughput Trade-Off

Figure 7 shows the grid energy consumption and throughput when $\omega_t$ takes different values. When $\omega_t$ is very small, the energy consumption and throughput of all schemes are similar. In this case, the system places a high priority on saving grid energy, which imposes strict limits on energy consumption. When $\omega_t$ increases, the throughput plays an increasingly important role in the reward. Although our scheme consumes a little more energy than the JPLA and PCA, it greatly improves the throughput. When the value of $\omega_t$ is large, all schemes pursue maximum throughput regardless of energy consumption costs. Therefore, all throughput gains are very close. However, the BIABoSE consumes the least energy and is closest to the JPLA-F and BIABoSE-F. In addition, once the throughput constraints are given, we can find the corresponding value of $\omega_t$ and obtain the minimum grid energy consumption.
The energy consumption and throughput are shown in Figure 8. As can be seen from the graph, the BIABoSE consumes less grid energy than the JPLA when it achieves the same throughput, and it can transmit more bits than the JPLA with the same grid energy supply. In short, the BIABoSE has a better trade-off between energy consumption and throughput, and is closer to the ideal situations represented by the JPLA-F and BIABoSE-F.

5. Conclusions

In this paper, we proposed an online power allocation and link selection strategy to maximize the total rewards of two-hop relay wireless networks in which the source node and relay are powered by grid and green energy, respectively. Simulation results show that this scheme achieves a higher total reward than conventional schemes under different settings, particularly when green energy is scarce. Next, we will continue to study energy harvesting technology in multifunctional relay nodes. The research results will then be applied to practical scenarios such as 5G heterogeneous networks, the Internet of Things and other networks.

Author Contributions

R.W. and H.X. conceived and designed the experiments; R.W. and H.X. performed the simulations; H.X. and Z.C. wrote the paper; R.W. and L.T. technically reviewed the paper.

Funding

This research was funded by the National Natural Science Foundation of China (No. 51677065).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Liu, T.; Li, J.; Feng, S.; Guan, H.; Yan, S.; Jayakody, D.N.K. On the Incentive Mechanisms for Commercial Edge Caching in 5G Wireless Networks. IEEE Wirel. Commun. 2018, 25, 72–78.
  2. Ru, W.; Jia, L.; Zhang, G.; Huang, S.; Yuan, M. Energy Efficient Power Allocation for Relay-Aided D2D Communications in 5G Networks. China Commun. 2017, 14, 54–64.
  3. Li, Z.; Fu, X.; Wang, S.; Pei, T.; Li, J. Achievable Rate Maximization for Cognitive Hybrid Satellite-Terrestrial Networks With AF-Relays. IEEE J. Sel. Areas Commun. 2018, 36, 304–313.
  4. Andrawes, A.; Nordin, R.; Ismail, M. Wireless Energy Harvesting with Cooperative Relaying under the Best Relay Selection Scheme. Energies 2019, 12, 892.
  5. Lei, C.; Yu, F.R.; Hong, J.; Rong, B.; Li, X.; Leung, V.C.M. Green Full-Duplex Self-Backhaul and Energy Harvesting Small Cell Networks with Massive MIMO. IEEE J. Sel. Areas Commun. 2016, 34, 3709–3724.
  6. Zhu, Z.; Huang, S.; Zheng, C.; Zhou, F.; Zhang, D.; Lee, I. Robust Designs of Beamforming and Power Splitting for Distributed Antenna Systems with Wireless Energy Harvesting. IEEE Syst. J. 2018, 13, 30–41.
  7. Wang, L.; Wong, K.K.; Shi, J.; Gan, Z.; Robert, W.H., Jr. A New Look at Physical Layer Security, Caching, and Wireless Energy Harvesting for Heterogeneous Ultra-Dense Networks. IEEE Commun. Mag. 2017, 56, 49–55.
  8. Al-Hraishawi, H.; Baduge, G.A.A. Wireless Energy Harvesting in Cognitive Massive MIMO Systems with Underlay Spectrum Sharing. IEEE Wirel. Commun. Lett. 2017, 6, 134–137.
  9. Zhao, C.; Cai, L.X.; Yu, C.; Shan, H. Sustainable Cooperative Communication in Wireless Powered Networks with Energy Harvesting Relay. IEEE Trans. Wirel. Commun. 2017, 16, 8175–8189.
  10. Ozel, O.; Tutuncuoglu, K.; Yang, J.; Ulukus, S.; Yener, A. Transmission with Energy Harvesting Nodes in Fading Wireless Channels: Optimal Policies. IEEE J. Sel. Areas Commun. 2011, 29, 1732–1743.
  11. Zhao, M.; Zhao, J.; Zhou, W.; Zhu, J.; Zhang, S. Energy efficiency optimization in relay-assisted networks with energy harvesting relay constraints. China Commun. 2015, 12, 84–94.
  12. Ahmed, I.; Ikhlef, A.; Schober, R.; Mallik, R.K. Power Allocation for Conventional and Buffer-Aided Link Adaptive Relaying Systems with Energy Harvesting Nodes. IEEE Trans. Wirel. Commun. 2014, 13, 1182–1195.
  13. Zhi, C.; Dong, Y.; Fan, P.; Letaief, K.B. Optimal Throughput for Two-Way Relaying: Energy Harvesting and Energy Co-Operation. IEEE J. Sel. Areas Commun. 2016, 34, 1448–1462.
  14. Luo, Y.; Zhang, J.; Letaief, K.B. Relay selection for energy harvesting cooperative communication systems. In Proceedings of the IEEE Global Communications Conference, Atlanta, GA, USA, 9–13 December 2013.
  15. Lee, Y.H.; Liu, K.H. Battery-aware relay selection for energy-harvesting relays with energy storage. In Proceedings of the IEEE 26th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Hong Kong, China, 30 August–2 September 2015.
  16. Wang, F.; Guo, S.; Yang, Y.; Xiao, B. Relay Selection and Power Allocation for Cooperative Communication Networks with Energy Harvesting. IEEE Syst. J. 2016, 12, 1–12.
  17. Yuan, W.; Li, P.Q.; Liang, H.; Shen, X. Optimal Relay Selection and Power Control for Energy-Harvesting Wireless Relay Networks. IEEE Trans. Green Commun. Netw. 2018, 2, 471–481.
  18. Yu, P.S.; Lee, J.; Quek, T.Q.S.; Hong, Y.-W.P. Traffic Offloading in Heterogeneous Networks with Energy Harvesting Personal Cells—Network Throughput and Energy Efficiency. IEEE Trans. Wirel. Commun. 2015, 15, 1146–1161.
  19. Zhang, S.; Zhang, N.; Zhou, S.; Gong, J.; Niu, Z.; Shen, X. Energy-Aware Traffic Offloading for Green Heterogeneous Networks. IEEE J. Sel. Areas Commun. 2016, 34, 1116–1129.
  20. Dhillon, H.S.; Li, Y.; Nuggehalli, P.; Pi, Z.; Andrews, J.G. Fundamentals of Heterogeneous Cellular Networks with Energy Harvesting. IEEE Trans. Wirel. Commun. 2014, 13, 2782–2797.
  21. Ahmed, I.; Ikhlef, A.; Schober, R.; Mallik, R.K. Joint Power Allocation and Relay Selection in Energy Harvesting AF Relay Systems. IEEE Wirel. Commun. Lett. 2013, 2, 239–242.
  22. Huang, C.; Zhang, R.; Cui, S. Throughput Maximization for the Gaussian Relay Channel with Energy Harvesting Constraints. IEEE J. Sel. Areas Commun. 2013, 31, 1469–1479.
  23. Yu, G.; Jiang, Y.; Xu, L.; Li, G.Y. Multi-Objective Energy-Efficient Resource Allocation for Multi-RAT Heterogeneous Networks. IEEE J. Sel. Areas Commun. 2015, 33, 2118–2127.
  24. Gong, J.; Zhou, Z.; Zhou, S. On the Time Scales of Energy Arrival and Channel Fading in Energy Harvesting Communications. IEEE Trans. Green Commun. Netw. 2018, 2, 482–492.
  25. Benjaafar, S.; Morin, T.L.; Talavage, J.J. The strategic value of flexibility in sequential decision making. Eur. J. Oper. Res. 1995, 82, 438–457.
  26. Bertsekas, D.P. Dynamic Programming and Optimal Control; Athena Scientific: Belmont, MA, USA, 2005.
  27. Hu, Q.; Chen, X. The finiteness of the reward function and the optimal value function in Markov decision processes. Math. Methods Oper. Res. 1999, 49, 255–266.
  28. Mao, Y.; Zhang, J.; Letaief, K.B. Grid Energy Consumption and QoS Tradeoff in Hybrid Energy Supply Wireless Networks. IEEE Trans. Wirel. Commun. 2016, 15, 3573–3586.
  29. Song, H.; Liu, C.C.; Lawarrée, J.; Dahlgren, R.W. Optimal electricity supply bidding by Markov decision process. IEEE Trans. Power Syst. 2000, 15, 618–624.
Figure 1. A green wireless network with an energy harvesting relay.
Figure 2. Total reward vs. time slots for different numbers of battery intervals, $\omega_t = 1$, $d_{SD} = d_{SR} + d_{RD} = 40$ m.
Figure 3. The total rewards versus $d_{SD}$, $\omega_t = 1$, $\lambda = 2$.
Figure 4. The total rewards versus $d_{SD}$, $\omega_t = 1$, $\lambda = 4$.
Figure 5. The total rewards versus time slots, $\lambda = 2$ and $\omega_t = 1$.
Figure 6. The total rewards versus energy harvesting intensity, $N = 30$ and $\omega_t = 1$.
Figure 7. Grid energy consumption and throughput vs. $\omega_t$, $N = 30$, $\lambda = 2$.
Figure 8. Grid energy consumption versus throughput, $N = 30$, $\lambda = 2$.

Share and Cite

MDPI and ACS Style

Wu, R.; Xie, H.; Chen, Z.; Tang, L. Power Control and Link Selection for Wireless Relay Networks with Hybrid Energy Sources. Appl. Sci. 2019, 9, 2744. https://doi.org/10.3390/app9132744
