Article

Age of Information Minimization for Radio Frequency Energy-Harvesting Cognitive Radio Networks

1 College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310014, China
2 Guangxi Key Laboratory of Cryptography and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
* Author to whom correspondence should be addressed.
Entropy 2022, 24(5), 596; https://doi.org/10.3390/e24050596
Submission received: 23 February 2022 / Revised: 16 April 2022 / Accepted: 21 April 2022 / Published: 24 April 2022
(This article belongs to the Special Issue Age of Information: Concept, Metric and Tool for Network Control)

Abstract

The Age of Information (AoI) measures the freshness of information and is a critical performance metric for time-sensitive applications. In this paper, we consider a radio frequency energy-harvesting cognitive radio network, where the secondary user harvests energy from the primary users' transmissions and opportunistically accesses the primary users' licensed spectrum to deliver status-update data packets. We aim to minimize the AoI subject to the energy causality and spectrum constraints by optimizing the sensing and update decisions. We formulate the AoI minimization problem as a partially observable Markov decision process and solve it via dynamic programming. Simulation results verify that our proposed policy is significantly superior to the myopic policy under different parameter settings.

1. Introduction

To cope with both the spectrum scarcity and the energy shortage challenges in future wireless networks, radio frequency (RF) energy harvesting in cognitive radio networks (CRN) has become increasingly attractive. Cognitive radio technology allows secondary users (SUs) to opportunistically access the primary users' (PUs) licensed spectrum, on the condition that the SUs' transmissions do not cause harmful interference to the PUs [1,2,3,4]. Meanwhile, the RF energy-harvesting technique overcomes the intermittency and uncontrollability of conventional charging techniques that absorb energy from renewable energy sources [5,6,7]. Hence, combining the two can simultaneously improve energy efficiency and spectral efficiency, since the SUs can capture both energy and spectrum [8].
While existing works mainly investigated the throughput of the RF energy-harvesting CRN, many emerging applications require timely status-update delivery [9,10,11,12,13,14,15], e.g., health monitoring, environment monitoring, smart buildings, and vehicle-to-vehicle networking. For example, in health monitoring, the sensors continuously measure blood pressure and heartbeat and report them to the health monitoring platform, so the freshness and timeliness of the status updates are important. The Age of Information (AoI), a recently proposed performance metric, can be used to quantify the freshness and timeliness of status updates [16,17,18,19,20,21,22,23]. It is defined as the time elapsed since the generation time of the latest successfully received status update at the destination.
Some innovative efforts have been devoted to the AoI of CRN [24,25,26,27,28]. In [24], the authors considered a cognitive wireless sensor network with a cluster of SUs and proposed a joint and scheduling strategy that optimized the energy efficiency of a communication system subject to the expected AoI. The authors in [25] considered an overlay CRN where the SU acted as a relay. The SU forwarded the PU's packets or transmitted its own packets. The optimal policy for status-update and packet relaying was investigated to minimize the average AoI and energy efficiency. In [26], the authors analyzed the average peak AoI of the PU and SU for both overlay and underlay schemes. The asymptotic expressions of the average peak AoI were derived when the PU operated at a high signal-to-noise ratio. Considering that it is difficult for the PU to keep time-slotted synchronization with the SU, the authors in [27] investigated AoI minimization in CRN with an unslotted PU. The closed-form expression was derived by conducting a Markov chain analysis. In [28], the authors considered AoI minimization for energy-harvesting CRN. They assumed that the SU harvests energy from ambient energy sources and derived the optimal sensing and update policies for both perfect and imperfect spectrum sensing.
Overall, the aforementioned research efforts rarely address AoI minimization for RF energy-harvesting CRN. Motivated by this, this article attempts to minimize the average AoI by adaptively making sensing and updating decisions subject to the energy causality and spectrum constraints with imperfect spectrum sensing. The system consists of one PU and one SU. Unlike in [28], the SU harvests RF energy from PU transmissions instead of ambient energy sources, and this energy is then used to generate and deliver status-update data packets when the PU is idle. The SU utilizes the harvested energy to perform spectrum sensing and updating. The main contributions of this paper are summarized as follows:
  • We study the average AoI minimization for RF energy-harvesting CRN where the SU harvests energy from PU transmissions. In each time slot, the SU adaptively makes sensing and updating decisions based on the channel state information, the AoI value, the available energy, and the belief of PU’s spectrum.
  • We formulate the decision-making problem as a framework of a partially observable Markov decision process (POMDP) with finite state and action spaces. Then we use dynamic programming to obtain the optimal policy.
  • We demonstrate through extensive simulations that the proposed policy substantially improves the system performance compared to the myopic policy under different system parameter settings.
The remainder of the paper is organized as follows. In Section 2, we review the works on RF energy-harvesting CRN in the literature. Section 3 describes the system model for the RF energy-harvesting CRN. Section 4 first formulates the AoI minimization problem as a POMDP and then solves it through dynamic programming. Section 5 presents simulation results and discussions. Finally, Section 6 concludes this paper.

2. Related Works on RF Energy-Harvesting CRN

Recently, cognitive radio technology has drawn significant attention as a promising solution to the severe scarcity of licensed spectrum. Cognitive radio allows SUs to opportunistically access PUs' licensed spectrum, on the condition that the SUs' transmissions do not cause harmful interference to the PUs [1,2,3]. Spectrum sensing is an important functionality in the cognitive radio system [29], by which the SUs decide whether the spectrum is occupied by the PUs. It can be performed by a single SU or in cooperation with multiple SUs. The SUs can only transmit data when the PUs are idle [30]. Various spectrum-sensing approaches have been developed based on different features of the PUs' signal [31], such as coherent detection [32], energy detection [33], and feature detection [34].
On the other hand, energy shortage is also a challenge in future wireless networks. Over the past years, the RF energy-harvesting technique has emerged as a candidate method for charging low-power wireless devices, which can overcome the intermittency and uncontrollability of conventional charging techniques that absorb energy from renewable energy sources [5,6,7]. In [35], the authors proposed the harvest-then-transmit (HTT) protocol as one of the important transmission strategies of RF energy-harvesting technology, where the users first harvest energy from the hybrid access point (HAP) and then use the captured energy to transmit information to the HAP. There have been some related works before. In [36], the authors investigated the wireless-powered communication network (WPCN) where one HAP coordinated the wireless information and energy transmissions to a set of nodes, and considered transmission completion time minimization subject to a throughput requirement per node. Furthermore, the authors studied a similar WPCN scenario in [37], where they focused on energy provision minimization for two physical-layer protocols, non-orthogonal multiple access (NOMA) and time-division multiple access (TDMA). Different from the common WPCN with a fixed HAP, the transmission completion time minimization was investigated in aerial vehicle-enabled WPCN in [38].
To jointly address the two aforementioned challenges of spectrum scarcity and energy shortage, introducing RF energy harvesting in CRN has become increasingly attractive because it can simultaneously improve energy efficiency and spectral efficiency, since the SUs can capture both energy and spectrum [8]. The timely-delivery probability of data packets for the RF energy-harvesting SU was derived in [39], where the SU opportunistically accesses the spectrum vacated by the PU to deliver real-time data packets and harvests RF energy when the PU is active. Unlike the traditional RF energy-harvesting CRN system where the SU keeps synchronization with the PU, the authors in [40] considered an unslotted PU. The sensing intervals were derived to balance energy harvesting against spectrum access. However, both [39] and [40] focused on a simple CRN consisting of one PU and one SU. The authors in [41,42,43] considered a more general scenario where there were multiple SUs or multiple PUs. In [41], the multiple selection strategy was proposed for RF energy-harvesting CRN to maximize the SUs' average throughput. In [42], the authors studied a hybrid energy-harvesting SU that can capture energy from both renewable sources and ambient radio frequency signals. The asymptotic activity behavior of a single SU was analyzed by deriving the theoretical upper bound on sensing and transmission opportunities. In [43], the authors investigated the end-to-end throughput maximization by jointly optimizing the transmit power and time allocation for multiple SUs.

3. System Model

As illustrated in Figure 1, we investigate AoI minimization for an RF energy-harvesting CRN, where the system consists of one PU, one SU, and one CBS communicating with the SU. The SU is a wireless sensor node that monitors the physical process and randomly generates status updates for the CBS. It has no embedded power supply available and harvests RF energy from PU transmissions. Additionally, it opportunistically accesses the PU's licensed spectrum. We consider a time-slotted system over an interval of T time slots. The duration of each time slot is sufficient for the SU to generate one status-update data packet and for the CBS to receive it successfully. Without loss of generality, we assume that the time slot duration is 1 s. The important notations are summarized in Table 1.

3.1. Primary User Model

The occupancy of a channel by the PU is modeled as a two-state continuous-time Markov chain [44], i.e., active (A) and idle (I) states. In each time slot, the PU either stays in the idle state or occupies the spectrum in an active state. The two-state (active/idle) Markov chain model for modeling PU activity has been verified to be an appropriate model to characterize spectrum occupancy in the time domain [45]. Let $q_t \in \{A, I\}$ denote the state of the PU for $t = 0, 1, \ldots, T-1$. The transition probabilities of the two-state Markov chain are expressed as $p_{ai}$ and $p_{ii}$, which represent transitioning from the active state to the idle state, and still staying in the idle state, respectively. For $t = 0, 1, \ldots, T-1$, we have
$$p_{ai} \triangleq P(q_{t+1} = I \mid q_t = A), \tag{1}$$
$$p_{ii} \triangleq P(q_{t+1} = I \mid q_t = I). \tag{2}$$
The transition probabilities are known to the SU and can be obtained from long-term measurements.
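The PU activity model can be illustrated with a short simulation. The following is a minimal sketch, assuming a per-slot (discrete) evolution of the chain; the function names and the seed are illustrative and not part of the paper. It also computes the long-run idle probability obtained by solving the balance equation of the two-state chain, which for the values used later in the simulations ($p_{ii} = 0.8$, $p_{ai} = 0.5$) evaluates to $5/7$.

```python
import random


def stationary_idle_probability(p_ai: float, p_ii: float) -> float:
    """Long-run fraction of slots in which the PU is idle.

    Solves pi_I = pi_I * p_ii + (1 - pi_I) * p_ai for pi_I.
    """
    return p_ai / (1.0 - p_ii + p_ai)


def simulate_pu(p_ai: float, p_ii: float, num_slots: int,
                start_idle: bool = True, seed: int = 0) -> list:
    """Return a list of PU states ('A' or 'I'), one per time slot."""
    rng = random.Random(seed)  # seeded for reproducibility (assumption)
    idle = start_idle
    states = []
    for _ in range(num_slots):
        states.append("I" if idle else "A")
        # Probability that the PU is idle in the next slot.
        p_next_idle = p_ii if idle else p_ai
        idle = rng.random() < p_next_idle
    return states
```

Comparing the empirical idle fraction of a long simulated path with `stationary_idle_probability` is a quick sanity check on measured transition probabilities.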

3.2. Secondary User Model

We consider an SU that is slot-synchronized with the PU. At the beginning of each time slot, the SU decides whether to sense the PU's spectrum. If it decides not to sense the spectrum, it spends the entire time slot harvesting energy from the PU transmissions. That is, energy is harvested when the PU is active; otherwise, no energy is harvested. We assume imperfect sensing at the SU [46]. We denote the probability of a false alarm by $p_f$ (i.e., the probability of deciding that the spectrum is occupied by the PU while it is not). The probability of detection is denoted by $p_d$ (i.e., the probability of deciding that the PU is active when it is active). With $\hat{q}_t$ denoting the sensing result, we have
$$p_f = P(\hat{q}_t = A \mid q_t = I), \tag{3}$$
$$p_d = P(\hat{q}_t = A \mid q_t = A). \tag{4}$$
The SU acts on the sensing result in one of two ways. When the PU is sensed to be active, the SU will not deliver the status-update data packet, and it harvests energy if the PU is actually active. On the other hand, if the sensing result is that the spectrum is vacated by the PU, the SU needs to further decide whether to update. If an update packet is delivered, the SU will receive a 1-bit feedback signal from the CBS indicating whether the update was successful. When the sensing result $\hat{q}_t = I$ is correct, the update is successful. This happens with probability $1 - p_f$. Update failure occurs if the PU is active despite the SU declaring it idle. This happens with probability $1 - p_d$. The SU aims to minimize the average AoI by making the optimal sensing and update decisions over time slots $t = 0, 1, \ldots, T-1$. We denote the decision at time slot $t$ by $x_t = (\phi_t, \theta_t)$, where $\phi_t \in \{0 \text{ (not sense)}, 1 \text{ (sense)}\}$ and $\theta_t \in \{0 \text{ (not update)}, 1 \text{ (update)}\}$ denote the sensing and update decisions, respectively. The optimal sensing and update decisions are based on the SU's states and its statistical knowledge of the PU activity.
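The imperfect sensing model can be sketched as a random draw conditioned on the true PU state. The snippet below is an illustration only: the function names are hypothetical, and the false-alarm probability $p_f = 0.1$ used in the usage note is an assumed value, whereas $p_d = 0.8$ is the value used later in the simulations.

```python
import random


def sense(true_state: str, p_f: float, p_d: float, rng: random.Random) -> str:
    """Draw a sensing result ('A' or 'I') given the true PU state."""
    if true_state == "A":
        # Detection: the active PU is declared active with probability p_d.
        return "A" if rng.random() < p_d else "I"
    # False alarm: the idle PU is declared active with probability p_f.
    return "A" if rng.random() < p_f else "I"


def prob_sensed_idle(prob_idle: float, p_f: float, p_d: float) -> float:
    """Probability that the sensing result is 'I' when the PU is idle
    with probability prob_idle."""
    return prob_idle * (1.0 - p_f) + (1.0 - prob_idle) * (1.0 - p_d)
```

For instance, with an idle probability of 0.8, an assumed $p_f = 0.1$, and $p_d = 0.8$, `prob_sensed_idle(0.8, 0.1, 0.8)` evaluates $0.8 \times 0.9 + 0.2 \times 0.2 = 0.76$.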
(1) Belief model: The SU observes the availability of the PU spectrum by adaptively detecting and accessing the spectrum. The belief state of the PU spectrum can be obtained based on the SU's action and observation history. That is, at the beginning of each time slot $t$, the SU forms the belief $\rho_t$, which is the conditional probability that the PU is in the idle state given the SU's action and observation history.
(2) Channel model: Denote the channel power gains from the PU to the SU and from the SU to the CBS over time slot $t$ by $h_t$ and $g_t$, respectively. We consider a quasi-static channel model in which the channel state information is constant within a single time slot and varies across time slots. In particular, as is commonly assumed in the wireless communications literature, the channel state information of the current time slot can be perfectly obtained.
(3) RF energy-harvesting model: The SU has no embedded power supply and harvests energy from the spectrum occupied by the PU. The SU employs the HTT protocol; that is, it first captures energy from the PU transmissions and then uses the harvested energy to sense the spectrum and transmit data. Overall, there are two cases in which energy can be harvested in a time slot: (1) the SU decides not to sense and the PU is active, and (2) the SU decides to sense and the sensing result $\hat{q}_t = A$ is correct. The energy captured by the SU is expressed as
$$E_{H,t}^{m} = \eta \tau P h_t, \tag{5}$$
for $t = 0, 1, \ldots, T-1$ and $m = 1, 2$, where $\eta$, $\tau$, and $P$ denote the energy-harvesting efficiency, the energy-harvesting time, and the transmit power at the PU, respectively. The superscript $m$ denotes the two cases of energy harvesting mentioned above. The captured energy is used to perform spectrum sensing and updates. Denote the energy and time consumed by spectrum sensing by $\delta$ and $\tau_s$, respectively. Similarly, let $E_{T,t}$ and $\tau_t$ denote the energy and time consumed by an update, respectively. The energy consumption $E_{T,t}$ is time-varying because it depends on the channel state information $g_t$ from the SU to the CBS. According to Shannon's formula [47], the transmission rate $S/\tau_t$ can be expressed as $S/\tau_t = W \log_2\left(1 + \frac{E_{T,t}\, g_t}{\tau_t \sigma^2}\right)$, where $\sigma^2$ is the noise power at the CBS, $S$ is the size of the status-update data packet, and $W$ is the bandwidth. Rearranging this expression, we obtain the energy consumption $E_{T,t}$ as
$$E_{T,t} = \frac{\sigma^2 \tau_t}{g_t}\left(2^{\frac{S}{\tau_t W}} - 1\right). \tag{6}$$
Since the size of the status-update data packet is fixed, $E_{T,t}$ depends only on the channel state information from the SU to the CBS. Although a successful update reduces the AoI to one, when the channel quality is poor, it may be better not to deliver the status-update data packet in order to conserve energy. Note that update failure occurs if the sensing result $\hat{q}_t = I$ is incorrect. In this case, the SU will consume all its available energy. Let $B_{\max}$ denote the battery capacity of the SU. In time slot $t$, the battery state is $b_t$, which evolves as
$$b_{t+1} = \min\{b_t + E_{H,t}^{m} - \phi_t \delta - \theta_t E_{T,t},\ B_{\max}\}, \quad t = 0, 1, \ldots, T-1. \tag{7}$$
Hence, the SU must satisfy the energy causality constraint
$$\phi_t \delta + \theta_t E_{T,t} \le b_t, \quad t = 0, 1, \ldots, T-1. \tag{8}$$
(4) AoI model: We consider a linear model for the AoI [16], where the AoI is defined as the time elapsed from the moment when the most recently received update was generated to the present. Let $a_t \in \mathcal{A} \triangleq \{1, 2, \ldots, A_{\max}\}$ denote the AoI at time slot $t$. Here, $A_{\max}$ is the upper bound of the AoI and is defined as
$$A_{\max} = a_0 + T. \tag{9}$$
In the considered system, the SU adopts the generate-at-will scheme. That is, the SU generates and delivers a status-update data packet after making an update decision. At each time slot $t$, the data packet size $S$ is small enough for the packet to be generated and transmitted immediately and received by the end of the current time slot when the update decision is made and the sensing result $\hat{q}_t = I$ is correct. If the update is received at the CBS, the AoI decreases to one; otherwise, it increases by one. We consider an error-free channel through which the status-update data packet can be successfully received at the CBS when the update decision is made and the sensing result $\hat{q}_t = I$ is correct. The average AoI over an interval of $T$ time slots is expressed as
$$\bar{A} = \frac{1}{T} \sum_{t=0}^{T-1} a_t. \tag{10}$$
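The energy and AoI models above can be sketched in a few helper functions. This is a minimal illustration under stated assumptions: the function and parameter names are hypothetical, all quantities are in SI units (joules, watts, seconds, bits, hertz), and the update energy is obtained by inverting Shannon's formula as in the text.

```python
import math


def harvested_energy(eta: float, tau: float, pu_power_w: float, h: float) -> float:
    """Energy harvested in one slot: eta * tau * P * h."""
    return eta * tau * pu_power_w * h


def update_energy(packet_bits: float, tau_t: float, bandwidth_hz: float,
                  g: float, noise_w: float) -> float:
    """Energy needed to deliver one packet of packet_bits in tau_t seconds."""
    return noise_w * tau_t / g * (2.0 ** (packet_bits / (tau_t * bandwidth_hz)) - 1.0)


def next_battery(b: float, harvested: float, sense: int, update: int,
                 delta: float, e_update: float, b_max: float) -> float:
    """Battery recursion, enforcing the energy causality constraint first."""
    spent = sense * delta + update * e_update
    if spent > b:
        raise ValueError("energy causality violated")
    return min(b + harvested - spent, b_max)


def aoi_path(a0: int, delivered: list) -> list:
    """Return [a_0, ..., a_{T-1}]; delivered[t] is True when slot t ends with
    a successful update (the last flag only affects a_T, which is not averaged)."""
    ages = [a0]
    for ok in delivered[:-1]:
        ages.append(1 if ok else ages[-1] + 1)
    return ages


def average_aoi(ages: list) -> float:
    """Average AoI over the recorded slots."""
    return sum(ages) / len(ages)
```

As a small worked case, starting from $a_0 = 1$ with successful updates in the third and fifth of six slots, `aoi_path` gives the ages 1, 2, 3, 1, 2, 1, whose average is $10/6$.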

4. POMDP for AoI Minimization

In this section, we formulate the AoI minimization as a finite-horizon POMDP problem and solve for the optimal solutions via dynamic programming.

4.1. POMDP Formulation

We use a POMDP framework to model the optimal sensing and update decisions for the SU’s AoI minimization. The components of POMDP are described as follows.
  • Actions: At the beginning of each time slot $t$, the SU needs to decide whether to sense the spectrum. If it decides not to sense the spectrum, then it captures energy from the PU transmissions and does not update, i.e., $x_t = (0, 0)$. If it decides to sense the spectrum and finds that the PU is idle, it further decides whether to update based on the available energy, the AoI value, and the channel state information from the SU to the CBS and from the PU to the SU, i.e., $x_t = (1, 0)$ or $x_t = (1, 1)$. Thus, the action for each time slot $t$ is $x_t = (\phi_t, \theta_t) \in \mathcal{X} \triangleq \{(0, 0), (1, 0), (1, 1) : b_t \ge \phi_t \delta + \theta_t E_{T,t}\}$, where $\phi_t \in \Gamma_\phi \triangleq \{0, 1 : b_t \ge \phi_t \delta\}$ and $\theta_t \in \Gamma_\theta \triangleq \{0, 1 : b_t \ge \delta + \theta_t E_{T,t}\}$.
  • Observations and beliefs: Let $\hat{q}_t \in \{A, I\}$ denote the observation of the PU's state. The belief $\rho_t \in [0, 1]$ is the conditional probability that the spectrum is vacated by the PU. The belief is updated according to the following cases.
    Case 1: The SU does not sense the spectrum; the new belief is given by
    $$\rho_{t+1} = \Lambda_0(\rho_t) = \rho_t p_{ii} + (1 - \rho_t) p_{ai}. \tag{11}$$
    Case 2: The PU is sensed to be active and the SU harvests energy in the remaining time of the current time slot, i.e., the battery energy increases. This implies that the true state of the PU is $q_t = A$. The belief is updated as
    $$\rho_{t+1} = p_{ai}. \tag{12}$$
    Case 3: The PU is sensed to be active but the SU does not harvest energy, i.e., the battery energy does not change and is lower than $B_{\max}$. This implies that the true state of the PU is $q_t = I$. The new belief is expressed as
    $$\rho_{t+1} = p_{ii}. \tag{13}$$
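    Case 4: The PU is sensed to be active and the battery energy is $B_{\max}$ at time slot $t$, so the SU cannot tell from the battery whether energy was harvested. The new belief is given by
    $$\rho_{t+1} = \Lambda_1^{A}(\rho_t) = \zeta_t p_{ii} + (1 - \zeta_t) p_{ai}, \tag{14}$$
    where, by Bayes' rule and the definitions of $p_f$ and $p_d$,
    $$\zeta_t \triangleq P(q_t = I \mid \hat{q}_t = A) = \frac{\rho_t p_f}{\rho_t p_f + (1 - \rho_t) p_d}. \tag{15}$$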
    Case 5: The PU is sensed to be idle and the SU does not update. The belief is updated as
    $$\rho_{t+1} = \Lambda_1^{I}(\rho_t) = \bar{\zeta}_t p_{ii} + (1 - \bar{\zeta}_t) p_{ai}, \tag{16}$$
    where
    $$\bar{\zeta}_t \triangleq P(q_t = I \mid \hat{q}_t = I) = \frac{\rho_t (1 - p_f)}{\rho_t (1 - p_f) + (1 - \rho_t)(1 - p_d)}. \tag{17}$$
    Case 6: The PU is sensed to be idle and the SU updates successfully. This implies that the true state of the PU is $q_t = I$. Then, we have
    $$\rho_{t+1} = p_{ii}. \tag{18}$$
    Case 7: The PU is sensed to be idle and the SU update fails. This implies that the true state of the PU is $q_t = A$. Then, we have
    $$\rho_{t+1} = p_{ai}. \tag{19}$$
    Although (11)–(19) cover seven cases, Cases 2 and 7 both yield the new belief $p_{ai}$, and Cases 3 and 6 both yield the new belief $p_{ii}$. Hence, the SU can only transit to five beliefs. That is, the number of possible beliefs is finite over $T$ time slots. Thus, for a horizon of $T$ time slots, the belief space $\Phi$ is a finite set.
  • States: Denote the discrete battery energy level of the SU at the beginning of time slot $t$ by $b_t \in \mathcal{B} \triangleq \{0, 1, 2, \ldots, b_{\max}\}$, where $b_{\max}$ is the maximum battery energy level that can be stored inside the battery of the SU. Then, each energy quantum of the SU's battery contains $B_{\max} / b_{\max}$ joules. In this case, we use $b_t \leftarrow b_t\, b_{\max} / B_{\max}$ (rounded to an integer level) to convert the continuous battery energy of the SU to the discrete battery energy level, by which a lower bound to the AoI of the original continuous system is obtained. Similarly, the continuous channel power gain is divided into a finite number of intervals according to the fading probability density function (PDF). Thus, the discrete channel power gain levels from the SU to the CBS and from the PU to the SU are expressed as $g_t \in \mathcal{G} \triangleq \{0, 1, 2, \ldots, g_{\max}\}$ and $h_t \in \mathcal{H} \triangleq \{0, 1, 2, \ldots, h_{\max}\}$, respectively. Here, $g_{\max}$ and $h_{\max}$ denote the corresponding maximum channel power gain levels. At each time slot $t$, the completely observable states include the channel state from the PU to the SU, the channel state from the SU to the CBS, the AoI state, and the battery state, denoted by $s_t \triangleq (h_t, g_t, a_t, b_t)$. Note that the state space, i.e., $\mathcal{S} \triangleq \mathcal{H} \times \mathcal{G} \times \mathcal{A} \times \mathcal{B}$, is finite. Due to imperfect sensing, an update may be unsuccessful when the sensing result is $\hat{q}_t = I$ and the update decision is $\theta_t = 1$. Thus,
    $$a_{t+1} = \begin{cases} 1, & \text{when } x_t = (1, 1) \text{ and } \hat{q}_t = q_t, \\ a_t + 1, & \text{otherwise}, \end{cases} \tag{20}$$
    for $t = 0, 1, \ldots, T-1$. Alternatively, we can express $a_{t+1} = (1 - \theta_t) a_t + 1$ when the sensing result $\hat{q}_t = I$ is correct. Additionally, the PU's spectrum state is only partially observable and is described by the belief $\rho_t$. Thus, for each time slot $t$, the complete system state is denoted by $(s_t, \rho_t)$. Since $\mathcal{S}$ and $\Phi$ are finite, the SU experiences a finite number of possible system states $(s_t, \rho_t) \in \mathcal{S} \times \Phi$.
  • Transition probabilities: For time slot $t$, given the current state $s_t = (h_t, g_t, a_t, b_t)$ and the action $x_t = (\phi_t, \theta_t)$, the transition probability to the next state $s_{t+1} = (h_{t+1}, g_{t+1}, a_{t+1}, b_{t+1})$ is denoted by $p_{x_t}(s_{t+1} \mid s_t)$. Since the captured energy and the channel power gains are independent and identically distributed (i.i.d.), the transition probabilities for taking actions other than $x_t = (1, 1)$ are given as follows.
    $$p_{x_t}(s_{t+1} \mid s_t) = P(a_{t+1} \mid a_t, x_t)\, P(b_{t+1} \mid b_t, g_t, h_t, x_t)\, P(g_{t+1})\, P(h_{t+1}), \tag{21}$$
    where
    $$P(a_{t+1} \mid a_t, x_t) = \begin{cases} 1, & \text{when } a_{t+1} = (1 - \theta_t) a_t + 1, \\ 0, & \text{otherwise}, \end{cases} \tag{22}$$
    $$P(b_{t+1} \mid b_t, g_t, h_t, x_t) = \begin{cases} 1, & \text{when } \phi_t = 0 \text{ and } b_{t+1} = \min\{b_t + E_{H,t}^{1}, B_{\max}\}, \\ 1, & \text{when } \phi_t = 0 \text{ and } b_{t+1} = b_t, \\ 1, & \text{when } \phi_t = 1,\ \theta_t = 0, \text{ and } b_{t+1} = \min\{b_t - \delta + E_{H,t}^{2}, B_{\max}\}, \\ 1, & \text{when } \phi_t = 1,\ \theta_t = 0, \text{ and } b_{t+1} = b_t - \delta, \\ 0, & \text{otherwise}. \end{cases} \tag{23}$$
    For the action $x_t = (1, 1)$, the transition probability is expressed as follows.
    $$p_{x_t}(s_{t+1} \mid s_t, \hat{q}_t, q_t) = P(a_{t+1} \mid a_t, x_t, \hat{q}_t, q_t) \times P(b_{t+1} \mid b_t, g_t, h_t, x_t) \times P(g_{t+1})\, P(h_{t+1}), \tag{24}$$
    where
    $$P(a_{t+1} \mid a_t, x_t) = \begin{cases} \bar{\zeta}_t, & \text{when } a_{t+1} = 1 \text{ and } q_t = \hat{q}_t, \\ 1 - \bar{\zeta}_t, & \text{when } a_{t+1} = a_t + 1 \text{ and } q_t \ne \hat{q}_t, \\ 0, & \text{otherwise}, \end{cases} \tag{25}$$
    and
    $$P(b_{t+1} \mid b_t, g_t, h_t, x_t) = \begin{cases} 1, & \text{when } \phi_t = 1,\ \theta_t = 1,\ b_{t+1} = b_t - \delta - E_{T,t}, \text{ and } \hat{q}_t = q_t, \\ 1, & \text{when } \phi_t = 1,\ \theta_t = 1,\ b_{t+1} = 0,\ \hat{q}_t = I, \text{ and } q_t = A, \\ 0, & \text{otherwise}. \end{cases} \tag{26}$$
  • Cost: Let the immediate cost at state $s_t$ be denoted by $C(s_t)$, which is the AoI accumulated at time slot $t$. Then, we have
    $$C(s_t) = a_t, \quad t = 0, 1, \ldots, T-1. \tag{27}$$
  • Policy: The policy is expressed as $\pi = \{\vartheta_0, \vartheta_1, \ldots, \vartheta_{T-1}\}$, where $\vartheta_t$ is the deterministic decision rule that maps a system state $(s_t, \rho_t) \in \mathcal{S} \times \Phi$ into an action $x_t \in \mathcal{X}$, i.e., $x_t = \vartheta_t(s_t, \rho_t)$. In this paper, let $\Pi$ denote the set of all deterministic decision policies.
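The belief update described in the cases above can be sketched as follows. This is an illustration rather than the authors' implementation: the function names are hypothetical, the posteriors are written directly from Bayes' rule using the definitions of $p_f$ and $p_d$, and the `revealed_state` argument stands for the situations in which the battery change or the 1-bit feedback reveals the true PU state (Cases 2, 3, 6, and 7).

```python
from typing import Optional


def predict(prob_idle_now: float, p_ai: float, p_ii: float) -> float:
    """One-step Markov prediction of the idle probability."""
    return prob_idle_now * p_ii + (1.0 - prob_idle_now) * p_ai


def posterior_idle(rho: float, sensed: str, p_f: float, p_d: float) -> float:
    """Bayes posterior P(q_t = I | sensing result) for sensed in {'A', 'I'}."""
    if sensed == "I":
        num = rho * (1.0 - p_f)
        den = num + (1.0 - rho) * (1.0 - p_d)
    else:
        num = rho * p_f
        den = num + (1.0 - rho) * p_d
    return num / den


def next_belief(rho: float, sensed: Optional[str], p_ai: float, p_ii: float,
                p_f: float, p_d: float,
                revealed_state: Optional[str] = None) -> float:
    """Belief for slot t + 1.

    sensed: None (no sensing), 'A', or 'I'.
    revealed_state: 'A' or 'I' if harvesting or feedback revealed the true
    PU state in slot t, otherwise None.
    """
    if revealed_state == "I":
        return p_ii                      # Cases 3 and 6
    if revealed_state == "A":
        return p_ai                      # Cases 2 and 7
    if sensed is None:
        return predict(rho, p_ai, p_ii)  # Case 1
    # Cases 4 and 5: only the noisy sensing result is available.
    return predict(posterior_idle(rho, sensed, p_f, p_d), p_ai, p_ii)
```

With $p_{ii} = 0.8$ and $p_{ai} = 0.5$, a belief of 0.8 that is propagated without sensing becomes $0.8 \times 0.8 + 0.2 \times 0.5 = 0.74$.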
Given the SU's initial state $s_0$ and belief $\rho_0$ of the PU's spectrum, the average AoI over $T$ time slots under the policy $\pi$ is given by
$$\bar{A}^{\pi}(s_0, \rho_0) = \frac{1}{T}\, \mathbb{E}\left[\sum_{t=0}^{T-1} C(s_t) \,\middle|\, s_0, \rho_0\right], \tag{28}$$
where the expectation is taken over the state trajectories induced by the policy $\pi$. Based on the above analysis, minimizing the average AoI by finding the optimal sensing and update policy corresponds to solving
$$\min_{\pi \in \Pi} \bar{A}^{\pi}(s_0, \rho_0). \tag{29}$$
Given $T$, (29) is a finite-state MDP with total cost. Based on (28) and (29), it might seem that, to minimize the average AoI, the SU should sense the spectrum and deliver the status-update data packet as long as it has sufficient energy. However, considering the channel state information, the belief of the PU's spectrum, and the battery energy available, always preferring spectrum sensing and status updates may not be the best decision. As a result, there is an optimal decision scheduling problem.

4.2. POMDP Solution

In this section, we use dynamic programming to solve the total cost minimization over $T$ time slots in (29) [48]. At a time slot $t$, the successive actions $\{x_k\}_{k=t}^{T-1}$ affect the states $s_k$ along with the accumulated AoI $C(s_k)$ for all $k = t, t+1, \ldots, T-1$. Let $V_t(s_t, \rho_t)$ denote the state-value function, which is given by
$$V_t(s_t, \rho_t) \triangleq \min_{\{x_k\}_{k=t}^{T-1}} \mathbb{E}\left[\sum_{k=t}^{T-1} C(s_k) \,\middle|\, s_t, \rho_t\right]. \tag{30}$$
It is the minimum expected cost accumulated from time slot $t$ to $T-1$ given state $(s_t, \rho_t)$. Thus, denote the minimum AoI in (29) by $A^{*} = V_0(s_0, \rho_0) / T$. Additionally, given $(s_t, \rho_t)$ and sensing action $\phi_t$, let $Q_t^{\phi_t}(s_t, \rho_t)$ represent the action-value function or Q-function, which is the minimum expected cost for taking sensing action $\phi_t$ at state $(s_t, \rho_t)$. The Q-function includes two parts: the immediate cost of taking the action at the current state and the expected sum of the state-value functions from the next time slot.
Overall, the formulated MDP problem can be solved recursively by dynamic programming as follows. For $t = 0, 1, \ldots, T-1$,
$$V_t(s_t, \rho_t) = \min_{\phi_t \in \Gamma_\phi} Q_t^{\phi_t}(s_t, \rho_t). \tag{31}$$
When $t = T-1$, we have
$$Q_{T-1}^{0}(s_{T-1}, \rho_{T-1}) = C(s_{T-1}) + C(s_T), \tag{32}$$
$$Q_{T-1}^{1}(s_{T-1}, \rho_{T-1}) = (1 - \Delta_{T-1})\, C(s_{T-1}) + \rho_{T-1} \times \Delta_{T-1} \min_{\theta_{T-1} \in \Gamma_\theta} C(s_{T-1}) + C(s_T). \tag{33}$$
When $t = 0, 1, \ldots, T-2$, we have
$$Q_t^{0}(s_t, \rho_t) = C(s_t) + \sum_{s_{t+1}} p_{00}(s_{t+1} \mid s_t)\, V_{t+1}(s_{t+1}, \Lambda_0(\rho_t)), \tag{34}$$
$$Q_t^{1}(s_t, \rho_t) = (1 - \Delta_t)\, Q_t^{1A}(s_t, \rho_t) + \Delta_t \min_{\theta_t \in \Gamma_\theta} Q_t^{1\theta_t}(s_t, \rho_t), \tag{35}$$
$$Q_t^{1A}(s_t, \rho_t) = C(s_t) + \sum_{s_{t+1}} p_{10}(s_{t+1} \mid s_t)\, V_{t+1}(s_{t+1}, \Lambda_1^{A}(\rho_t)), \tag{36}$$
$$Q_t^{10}(s_t, \rho_t) = C(s_t) + \sum_{s_{t+1}} p_{10}(s_{t+1} \mid s_t)\, V_{t+1}(s_{t+1}, \Lambda_1^{I}(\rho_t)), \tag{37}$$
$$Q_t^{11}(s_t, \rho_t) = C(s_t) + \sum_{s_{t+1}} p_{11}(s_{t+1} \mid s_t, \hat{q}_t = q_t)\, V_{t+1}(s_{t+1}, \Lambda^{I}(\rho_t)) + \sum_{s_{t+1}} p_{11}(s_{t+1} \mid s_t, \hat{q}_t \ne q_t)\, V_{t+1}(s_{t+1}, \Lambda^{A}(\rho_t)), \tag{38}$$
where $\Delta_t$ represents the probability of observing the PU idle. That is,
$$\Delta_t = P(\hat{q}_t = I) = \rho_t (1 - p_f) + (1 - \rho_t)(1 - p_d). \tag{39}$$
In particular, $Q_t^{1A}(s_t, \rho_t)$ in (36) represents the minimum expected cost of taking sensing action $\phi_t = 1$ and obtaining sensing result $\hat{q}_t = A$, i.e., $x_t = (1, 0)$. In (37) and (38), given the sensing action $\phi_t = 1$ and sensing result $\hat{q}_t = I$, $Q_t^{10}(s_t, \rho_t)$ and $Q_t^{11}(s_t, \rho_t)$ denote the minimum expected costs of taking update actions $\theta_t = 0$ and $\theta_t = 1$, respectively. Then, by the recursion in (31)–(38), the optimal policies for sensing and update are given by
$$\phi_t^{*}(s_t, \rho_t) \triangleq \operatorname*{argmin}_{\phi_t \in \Gamma_\phi} Q_t^{\phi_t}(s_t, \rho_t), \tag{40}$$
$$\theta_t^{*}(s_t, \rho_t) \triangleq \operatorname*{argmin}_{\theta_t \in \Gamma_\theta} Q_t^{1\theta_t}(s_t, \rho_t). \tag{41}$$
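The backward recursion can be illustrated on a deliberately reduced toy model. The sketch below is not the authors' implementation: it drops the channel states, uses integer energy quanta (one quantum per harvest, per sensing action, and per update), assumes $p_f = 0.1$, and charges the age at slot $T$ as a terminal cost in the spirit of the $t = T-1$ step above. The horizon, battery size, and all names are illustrative assumptions. Memoized recursion over $(t, a_t, b_t, \rho_t)$ plays the role of the value table.

```python
from functools import lru_cache

P_AI, P_II = 0.5, 0.8               # PU transition probabilities (Section 5)
P_F, P_D = 0.1, 0.8                 # p_f is an assumed value; p_d from Section 5
T, B_MAX = 6, 3                     # toy horizon and battery levels (assumed)
DELTA, E_UPDATE, HARVEST = 1, 1, 1  # energy quanta (assumed)


def predict(prob_idle):
    """One-step Markov prediction of the idle probability."""
    return round(prob_idle * P_II + (1 - prob_idle) * P_AI, 9)


@lru_cache(maxsize=None)
def value(t, age, battery, rho):
    """Return (minimum expected cost from slot t on, best action (phi, theta)).

    For phi = 1, theta is the update decision taken if the PU is sensed idle.
    """
    if t == T:
        return age, None  # terminal cost: the age reached at slot T
    # Action (0, 0): no sensing; one quantum is harvested if the PU is active.
    nxt = predict(rho)
    q0 = (age
          + rho * value(t + 1, age + 1, battery, nxt)[0]
          + (1 - rho) * value(t + 1, age + 1, min(battery + HARVEST, B_MAX), nxt)[0])
    best = (q0, (0, 0))
    if battery >= DELTA:
        b = battery - DELTA
        p_obs_idle = rho * (1 - P_F) + (1 - rho) * (1 - P_D)
        # Sensed active: harvesting (or its absence) reveals the true state.
        zeta = rho * P_F / (1 - p_obs_idle)
        q_active = (age
                    + zeta * value(t + 1, age + 1, b, P_II)[0]
                    + (1 - zeta) * value(t + 1, age + 1, min(b + HARVEST, B_MAX), P_AI)[0])
        # Sensed idle: either wait or attempt an update.
        zeta_bar = rho * (1 - P_F) / p_obs_idle
        q_idle = age + value(t + 1, age + 1, b, predict(zeta_bar))[0]
        theta = 0
        if b >= E_UPDATE:
            # Success resets the age; a failed update drains the battery.
            q_send = (age
                      + zeta_bar * value(t + 1, 1, b - E_UPDATE, P_II)[0]
                      + (1 - zeta_bar) * value(t + 1, age + 1, 0, P_AI)[0])
            if q_send < q_idle:
                q_idle, theta = q_send, 1
        q1 = (1 - p_obs_idle) * q_active + p_obs_idle * q_idle
        if q1 < best[0]:
            best = (q1, (1, theta))
    return best
```

Calling `value(0, 1, B_MAX, P_II)` returns the minimum expected accumulated age and the first-slot decision for a full battery. In this toy model, a policy that never senses accumulates $1 + 2 + \cdots + (T+1)$, which gives a simple upper bound against which the returned cost can be compared.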

5. Numerical Results

In this section, we evaluate the performance of our proposed optimal policy by comparing it with the myopic policy and the random policy. At the beginning of time slot $t$, for the myopic policy, the SU senses the spectrum if it has enough energy. When the sensing result is $\hat{q}_t = I$, the SU generates and delivers a status-update data packet if the energy available is sufficient. For the random policy, the SU randomly chooses to deliver the status-update data packet or harvest energy with a fixed probability. Taking into account the protection of the PU's transmission, the probability of harvesting energy is set to 90%, and the probability of delivering the status-update data packet is set to 10%. If the SU chooses to deliver the status-update data packet while the spectrum is occupied by the PU, the status update fails, and the AoI increases by one. The PU's state transition probabilities are $p_{ii} = 0.8$ and $p_{ai} = 0.5$. The probability of detecting an active PU is $p_d = 0.8$. The channel power gains from the PU to the SU and from the SU to the CBS are modeled as $h = Y \Psi^2 d_1^{-\kappa}$ and $g = Y \Psi^2 d_2^{-\kappa}$, where $d_1$ and $d_2$ denote the distances from the PU to the SU and from the SU to the CBS, respectively. $Y$ represents the signal power gain at a reference distance of 1 m, $\Psi \sim \exp(1)$ denotes the small-scale fading gain, and $d_1^{-\kappa}$ and $d_2^{-\kappa}$ are the standard power-law path loss with exponent $\kappa$. In the simulations, the system parameter values are set as follows: $\eta = 0.5$, $\sigma^2 = -95$ dBm, $W = 1$ MHz, $Y = 0.2$, $\kappa = 2$, $b_{\max} = 5$, $g_{\max} = h_{\max} = 10$, $\rho_0 = p_{ii}$, $\tau_s = 0.2$ s, and $A_{\max} = 13$.
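The channel model in this setup can be sampled with a few lines of code. The sketch below assumes that $\Psi$ is exponentially distributed with unit mean and uses the stated values $Y = 0.2$ and $\kappa = 2$ as defaults; the function names are illustrative. Under this assumption $\mathbb{E}[\Psi^2] = 2$, so the mean gain at distance $d$ is $2 Y d^{-\kappa}$.

```python
import random


def dbm_to_watt(dbm: float) -> float:
    """Convert a power level from dBm to watts."""
    return 10.0 ** (dbm / 10.0) / 1000.0


def sample_gain(distance_m: float, rng: random.Random,
                y: float = 0.2, kappa: float = 2.0) -> float:
    """Draw one channel power gain Y * Psi^2 * d^(-kappa).

    Psi is drawn from an exponential distribution with unit mean (assumption).
    """
    psi = rng.expovariate(1.0)
    return y * psi ** 2 * distance_m ** (-kappa)


def mean_gain(distance_m: float, y: float = 0.2, kappa: float = 2.0) -> float:
    """Expected gain, using E[Psi^2] = 2 for a unit-mean exponential Psi."""
    return 2.0 * y * distance_m ** (-kappa)
```

For example, `dbm_to_watt(35)` converts the PU transmit power used in the figures to watts, and `mean_gain(25)` gives the average PU-to-SU gain at a distance of 25 m under the stated assumption.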
Figure 2 shows one sample path of the AoI by the optimal policy. The transmit power of the PU is 35 dBm, the energy consumption is one energy quantum, the distance from the the SU to the CBS is 20 m, the distance from the PU to the SU is 25 m, the size of the status-update data pack is 14 Mbits, and the battery capacity is 0.5 mJoules. The trend of the AoI over time slots is clearly observed. In the simulations, we found the SU did not perform sensing spectrum even the remaining energy was enough, which verifies the foresight of the optimal policy compared with the myopic policy.
Figure 3 shows the size of the status-update data packet versus the AoI, where the simulation setup is similar as in Figure 3. It is clear that our proposed policy is superior to the other policies. For the random policy, the AoI is obviously high due to the low probability of delivering the status-update data pack. For the random policy, the AoI is greater than 5.57, due to the low probability of delivering the status-update data pack. Considering the poor AoI performance of the random policy, we only compare our algorithm with the myopic algorithm in the following numerical evaluations. We can observe that the AoI increases with the size of the status-update data packet. The reason is that the increase in the size of the status-update data packet will result in increasing the energy needed to deliver one status-update data pack. This decreases the possibility that the SU will have enough energy to update, and hence the AoI is increased.
Figure 4 shows the AoI versus the transmit power of the PU, where the battery capacity is 0.2 mJ, the distance from the PU to the SU is 5 m, the distance from the SU to the CBS is 25 m, and the size of the status-update data packet is 15 Mbits. We can observe from Figure 4 that the average AoI decreases as the transmit power of the PU increases. The reason is that the SU harvests more energy as the transmit power of the PU increases, which allows the SU to store more energy in the battery. This increases the probability that the SU has enough energy to update, and hence the AoI decreases.
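The link between the PU transmit power and the harvested energy can be sketched as follows. The Python snippet assumes a linear harvesting model in which the harvested energy equals the harvesting efficiency times the received power times the harvesting duration; this model, the function names, and the example gain are assumptions for illustration, and only the efficiency value of 0.5 comes from the simulation setup.

```python
def dbm_to_watts(p_dbm):
    """Convert a power level from dBm to watts."""
    return 10.0 ** ((p_dbm - 30.0) / 10.0)


def harvested_energy(pu_power_dbm, gain, duration_s, efficiency=0.5):
    """Energy in joules harvested from the PU signal (assumed linear model)."""
    received_power_w = dbm_to_watts(pu_power_dbm) * gain
    return efficiency * received_power_w * duration_s


if __name__ == "__main__":
    # Hypothetical channel gain and duration, chosen only to show the trend.
    for p_dbm in (25, 30, 35):
        print(p_dbm, "dBm ->", harvested_energy(p_dbm, gain=1e-4, duration_s=1.0), "J")
```

Under this model every 10 dB increase in the PU transmit power multiplies the harvested energy by ten, which is consistent with the SU being able to update more often at higher PU power.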
Figure 5 shows the AoI versus the battery capacity, where the size of the status-update data packet is 15 Mbits, the energy consumption of spectrum sensing is one energy quantum, the transmit power of the PU is 35 dBm, the distance from the SU to the CBS is 10 m, and the distance from the PU to the SU is 5 m. It is clearly observed that the proposed policy substantially improves the AoI compared with the myopic policy. We can also observe that the average AoI decreases as the battery capacity increases. The reason is that a larger battery capacity allows more harvested energy to be stored. Thus, the SU is more likely to have enough energy to perform an update, and hence the AoI is reduced.
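The role of the battery capacity can be illustrated with a small Python sketch of a quantized battery. It assumes that the battery capacity is divided into a fixed number of equal energy quanta (five levels, matching the maximum battery level in the simulation setup) and that harvested energy beyond the capacity is discarded. The function names and the update rule are assumptions for illustration rather than the paper's exact battery dynamics.

```python
def energy_quantum(battery_capacity_j, b_max=5):
    """Size in joules of one energy quantum, assuming b_max equal levels."""
    return battery_capacity_j / b_max


def next_battery_level(level, harvested_quanta, spent_quanta, b_max=5):
    """Battery level (in quanta) after spending and then harvesting energy.

    Energy causality: the SU cannot spend more than it currently stores.
    Overflow beyond b_max is lost (assumed behavior).
    """
    if spent_quanta > level:
        raise ValueError("energy causality violated")
    return min(level - spent_quanta + harvested_quanta, b_max)


if __name__ == "__main__":
    print(energy_quantum(0.5e-3))        # quantum size for a 0.5 mJ battery
    print(next_battery_level(4, 3, 0))   # overflow is clipped at b_max
```

With the number of levels held fixed, a larger capacity means each quantum holds more energy, so fewer harvested joules are wasted by clipping at the top level.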
Figure 6 shows the AoI versus the energy consumption of spectrum sensing. The simulation setup is similar to that of Figure 5. It is observed that the average AoI increases with the energy consumption of the sensing action. The reason is that a higher sensing energy consumption leaves less energy in the battery. This, in turn, decreases the probability that the SU has enough energy to deliver the status-update data packet, and hence the AoI increases.
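The effect of the sensing cost on the remaining update budget is easiest to see in the myopic baseline, which senses whenever it can afford to. The Python sketch below encodes one slot of that rule in units of energy quanta. The function name and signature are hypothetical, energy harvesting within the slot is omitted for brevity, and the sketch is an illustrative reading of the myopic policy described in this section, not the authors' implementation.

```python
def myopic_step(battery, sensing_cost, update_cost, sensed_idle):
    """One slot of the myopic baseline, in energy quanta.

    Returns (sense, update, battery_after). The SU senses if it can pay the
    sensing cost, and updates only if the channel is sensed idle and the
    remaining energy covers the update cost.
    """
    sense = battery >= sensing_cost
    if not sense:
        return False, False, battery
    battery -= sensing_cost
    update = sensed_idle and battery >= update_cost
    if update:
        battery -= update_cost
    return True, update, battery


if __name__ == "__main__":
    # Same battery, two sensing costs: the higher cost blocks the update.
    print(myopic_step(3, 1, 2, True))
    print(myopic_step(3, 2, 2, True))
```

In the second call the SU still senses an idle channel, but the sensing action has consumed the energy that the update would have needed.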

6. Conclusions

In this paper, we investigated an RF energy-harvesting CRN with the aim of minimizing the AoI subject to the energy causality and spectrum constraints. We first formulated the average AoI minimization problem as a POMDP based on the AoI value, the channel state information, the available energy, and the belief about the PU’s spectrum state, and then adopted dynamic programming to find the optimal sensing and update decisions. Numerical results showed the influence of the system parameters on the AoI and demonstrated that the proposed policy significantly outperforms the myopic policy.

Author Contributions

Conceptualization, all co-authors; methodology, all co-authors; software, not applicable; validation, all co-authors; formal analysis, all co-authors; investigation, all co-authors; resources, all co-authors; data curation, J.S.; writing—original draft preparation, J.S. and S.Z.; writing—review and editing, all co-authors; visualization, J.S. and C.Y.; supervision, S.Z.; project administration, L.H.; funding acquisition, L.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the National Natural Science Foundation of China under grant number 62072410 and the Zhejiang Provincial Natural Science Foundation of China under grant number LQ22F020009.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. He, Y.; Xue, J.; Ratnarajah, T.; Sellathurai, M.; Khan, F. On the Performance of Cooperative Spectrum Sensing in Random Cognitive Radio Networks. IEEE Syst. J. 2018, 12, 881–892. [Google Scholar] [CrossRef] [Green Version]
  2. Zheng, K.; Liu, X.; Liu, X.; Zhu, Y. Hybrid overlay-underlay cognitive radio networks with energy harvesting. IEEE Trans. Commun. 2019, 67, 4669–4682. [Google Scholar] [CrossRef]
  3. Liu, X.; Zheng, K.; Chi, K.; Zhu, Y. Cooperative spectrum sensing optimization in energy-harvesting cognitive radio networks. IEEE Trans. Wirel. Commun. 2020, 19, 7663–7676. [Google Scholar] [CrossRef]
  4. Yu, H.; Zikria, Y.B. Cognitive Radio Networks for Internet of Things and Wireless Sensor Networks. Sensors 2020, 20, 5288. [Google Scholar] [CrossRef]
  5. Bi, S.; Zeng, Y.; Zhang, R. Wireless powered communication networks: An overview. IEEE Wirel. Commun. 2016, 23, 10–18. [Google Scholar] [CrossRef] [Green Version]
  6. Zhang, S.; Kong, S.; Chi, K.; Huang, L. Energy Management for Secure Transmission in Wireless Powered Communication Networks. IEEE Internet Things J. 2022, 9, 1171–1181. [Google Scholar] [CrossRef]
  7. Guo, H.; Li, J.; Liu, J.; Na, T.; Kato, N. A Survey on Space-Air-Ground-Sea Integrated Network Security in 6G. IEEE Commun. Surv. Tutor. 2021, 24, 53–87. [Google Scholar] [CrossRef]
  8. Zhang, Y.; Han, W.; Li, D.; Zhang, P.; Cui, S. Power versus spectrum 2-D sensing in energy harvesting cognitive radio networks. IEEE Trans. Signal Process. 2015, 63, 6200–6212. [Google Scholar] [CrossRef]
  9. Khan, A.A.; Rehmani, M.H.; Rachedi, A. When cognitive radio meets the Internet of Things. In Proceedings of the International Wireless Communications & Mobile Computing Conference (IWCMC 2016), Paphos, Cyprus, 5–9 September 2016; pp. 469–474. [Google Scholar]
  10. Perera, C.; Liu, C.H.; Jayawardena, S. The emerging internet of things marketplace from an industrial perspective: A survey. IEEE Trans. Emerg. Top. Comput. 2015, 3, 585–598. [Google Scholar] [CrossRef] [Green Version]
  11. Guo, H.; Huang, W.; Liu, J.; Wang, Y. Inter-Server Collaborative Federated Learning for Ultra-Dense Edge Computing. IEEE Trans. Wirel. Commun. 2021; in press. [Google Scholar] [CrossRef]
  12. Sun, W.; Lei, S.; Wang, L.; Liu, Z.; Zhang, Y. Adaptive Federated Learning and Digital Twin for Industrial Internet of Things. IEEE Trans. Ind. Inform. 2021, 17, 5605–5614. [Google Scholar] [CrossRef]
  13. Bi, S.; Huang, L.; Zhang, Y.-J.A. Joint Optimization of Service Caching Placement and Computation Offloading in Mobile Edge Computing Systems. IEEE Trans. Wirel. Commun. 2020, 19, 4947–4963. [Google Scholar] [CrossRef] [Green Version]
  14. Sun, W.; Xu, N.; Wang, L.; Zhang, H.; Zhang, Y. Dynamic Digital Twin and Federated Learning with Incentives for Air-Ground Networks. IEEE Trans. Netw. Sci. Eng. 2022, 9, 321–333. [Google Scholar] [CrossRef]
  15. Khan, A.A.; Rehmani, M.H.; Rachedi, A. Cognitive-radio-based internet of things: Applications, architectures, spectrum related functionalities, and future research directions. IEEE Wirel. Commun. 2017, 24, 17–25. [Google Scholar] [CrossRef]
  16. Kaul, S.; Yates, R.; Gruteser, M. Real-time status: How often should one update? In Proceedings of the IEEE International Conference on Computer Communications (INFOCOM), Orlando, FL, USA, 25–30 March 2012; pp. 2731–2735. [Google Scholar]
  17. Kam, C.; Kompella, S.; Ephremides, A. Effect of message transmission diversity on status age. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Honolulu, HI, USA, 29 June–4 July 2014; pp. 2411–2415. [Google Scholar]
  18. Costa, M.; Codreanu, M.; Ephremides, A. Age of information with packet management. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Seoul, Korea, 29 June–4 July 2014; pp. 1583–1587. [Google Scholar]
  19. Sun, Y.; Uysal-Biyikoglu, E.; Yates, R.; Koksal, C.E.; Shroff, N.B. Update or wait: How to keep your data fresh. In Proceedings of the IEEE International Conference on Computer Communications (INFOCOM), San Francisco, CA, USA, 10–14 April 2016; pp. 1–9. [Google Scholar]
  20. Huang, L.; Qian, L.P. Age of Information for Transmissions over Markov Channels. In Proceedings of the IEEE Global Communications Conference (GLOBECOM), Singapore, 4–8 December 2017; pp. 1–9. [Google Scholar]
  21. Abd-Elmagid, M.A.; Pappas, N.; Dhillon, H.S. On the role of age of information in the Internet of Things. IEEE Commun. Mag. 2019, 57, 72–77. [Google Scholar] [CrossRef] [Green Version]
  22. Nguyen, G.; Kompella, S.; Kam, C.; Wieselthier, J.; Ephremides, A. Impact of hostile interference on information freshness: A game approach. In Proceedings of the 15th International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt), Paris, France, 15–19 May 2017. [Google Scholar]
  23. Garnaev, A.; Zhang, W.; Zhong, J.; Yates, R.D. Maintaining information freshness under jamming. In Proceedings of the IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Paris, France, 29 April–2 May 2019; pp. 90–95. [Google Scholar]
  24. Valehi, A.; Razi, A. Maximizing Energy Efficiency of Cognitive Wireless Sensor Networks With Constrained Age of Information. IEEE Trans. Cogn. Commun. Netw. 2017, 3, 643–654. [Google Scholar] [CrossRef]
  25. Zhao, Y.; Zhou, B.; Saad, W.; Luo, X. Age of information analysis for dynamic spectrum sharing. In Proceedings of the Global Conference on Signal and Information Processing (GlobalSIP), Ottawa, ON, Canada, 11–14 November 2019; pp. 1–5. [Google Scholar]
  26. Gu, Y.; Chen, H.; Zhai, C.; Li, Y.; Vucetic, B. Minimizing age of information in cognitive radio-based IoT systems: Underlay or Overlay? IEEE Internet Things J. 2019, 6, 10273–10288. [Google Scholar] [CrossRef] [Green Version]
  27. Wang, Q.; Chen, H.; Gu, Y.; Li, Y.; Vucetic, B. Minimizing the Age of Information of cognitive radio-based IoT systems under a collision constraint. IEEE Trans. Wirel. Commun. 2020, 19, 8054–8067. [Google Scholar] [CrossRef]
  28. Leng, S.; Yener, A. Age of information minimization for an energy harvesting cognitive radio. IEEE Trans. Cognit. Commun. Netw. 2019, 5, 427–439. [Google Scholar] [CrossRef]
  29. Yilmaz, Y.; Moustakides, G.V.; Wang, X. Cooperative Sequential Spectrum Sensing Based on Level-Triggered Sampling. IEEE Trans. Signal Process. 2012, 60, 4509–4524. [Google Scholar] [CrossRef]
  30. Zhao, W.; Ali, S.S.; Jin, M.; Cui, G.; Zhao, N.; Yoo, S.-J. Extreme Eigenvalues-Based Detectors for Spectrum Sensing in Cognitive Radio Networks. IEEE Trans. Commun. 2022, 70, 538–551. [Google Scholar] [CrossRef]
  31. Ma, J.; Li, G.Y.; Juang, B.H. Signal processing in cognitive radio. Proc. IEEE 2009, 97, 805–823. [Google Scholar]
  32. Tang, H. Some physical layer issues of wide-band cognitive radio systems. In Proceedings of the 1st IEEE International Symposium on New Frontiers in Dynamic Spectrum Access Networks (DySPAN), Baltimore, MD, USA, 8–11 November 2005; pp. 151–159. [Google Scholar]
  33. Akyildiz, I.F.; Lo, B.F.; Balakrishnan, R. Cooperative spectrum sensing in cognitive radio networks: A survey. Phys. Commun. 2011, 4, 40–62. [Google Scholar] [CrossRef]
  34. Ghasemi, A.; Sousa, E.S. Collaborative spectrum sensing for opportunistic access in fading environments. In Proceedings of the 1st IEEE International Symposium on New Frontiers in Dynamic Spectrum Access Networks (DySPAN), Baltimore, MD, USA, 8–11 November 2005; pp. 131–136. [Google Scholar]
  35. Ho, C.K.; Zhang, R. Optimal Energy Allocation for Wireless Communications with Energy Harvesting Constraints. IEEE Trans. Signal Process. 2012, 60, 4808–4818. [Google Scholar] [CrossRef] [Green Version]
  36. Chi, K.; Zhu, Y.; Li, Y.; Huang, L.; Xia, M. Minimization of Transmission Completion Time in Wireless Powered Communication Networks. IEEE Internet Things J. 2017, 4, 1671–1683. [Google Scholar] [CrossRef]
  37. Chi, K.; Chen, Z.; Zheng, K.; Zhu, Y.-H.; Liu, J. Energy Provision Minimization in Wireless Powered Communication Networks With Network Throughput Demand: TDMA or NOMA? IEEE Trans. Commun. 2019, 67, 6401–6414. [Google Scholar] [CrossRef]
  38. Chen, Z.; Chi, K.; Zheng, K.; Dai, G.; Shao, Q. Minimization of transmission completion time in UAV-enabled wireless powered communication networks. IEEE Internet Things J. 2020, 7, 1245–1259. [Google Scholar] [CrossRef]
  39. Bae, Y.H.; Baek, J.W. Performance analysis of delay-constrained traffic in a cognitive radio network with RF energy harvesting. IEEE Commun. Lett. 2019, 23, 2177–2181. [Google Scholar] [CrossRef]
  40. Li, K.H.; Teh, K.C. Optimal spectrum access and energy supply for cognitive radio systems with opportunistic RF energy harvesting. IEEE Trans. Veh. Technol. 2017, 66, 7114–7122. [Google Scholar]
  41. Xu, M.; Jin, M.; Guo, Q.; Li, Y. Multichannel Selection for Cognitive Radio Networks With RF Energy Harvesting. IEEE Wirel. Commun. Lett. 2018, 7, 178–181. [Google Scholar] [CrossRef]
  42. Celik, A.; Alsharoa, A.; Kamal, A.E. Hybrid energy harvesting- Based cooperative spectrum sensing and access in heterogeneous cognitive radio networks. IEEE Tran. Cognit. Commun. Netw. 2017, 3, 37–48. [Google Scholar] [CrossRef]
  43. Xu, C.; Zheng, M.; Liang, W.; Yu, H.; Liang, Y. End-to-End Throughput Maximization for Underlay Multi-Hop Cognitive Radio Networks with RF Energy Harvesting. IEEE Trans. Wirel. Commun. 2017, 16, 3561–3572. [Google Scholar] [CrossRef]
  44. Masonta, M.T.; Mzyece, M.; Ntlatlapa, N. Spectrum decision in cognitive radio networks: A survey. IEEE Commun. Surv. Tutor. 2013, 15, 1088–1107. [Google Scholar] [CrossRef] [Green Version]
  45. López-Benítez, M.; Casadevall, F. Empirical time-dimension model of spectrum use based on a discrete-time Markov chain with deterministic and stochastic duty cycle models. IEEE Trans. Veh. Technol. 2011, 60, 2519–2533. [Google Scholar] [CrossRef]
  46. Unnikrishnan, J.; Veeravalli, V.V. Cooperative sensing for primary detection in cognitive radio. IEEE J. Select. Top. Signal Process. 2006, 2, 18–27. [Google Scholar] [CrossRef]
  47. Shannon, C.E. A mathematical theory of communication. ACM SIGMOBILE Mob. Comput. Commun. Rev. 2001, 5, 3–55. [Google Scholar] [CrossRef]
  48. Bertsekas, D.P. Dynamic Programming and Optimal Control; Athena Scientific: Belmont, MA, USA, 2005; Volume 1–2. [Google Scholar]
Figure 1. System model. In each time slot, the SU can harvest energy from the PU transmissions and can deliver the status-update data packet to the CBS when the channel is idle.
Figure 2. One sample path of the AoI by the optimal policy.
Figure 3. The size of status-update data packet versus the AoI when T = 10.
Figure 4. The transmit power of PU versus the AoI when T = 10.
Figure 5. The battery capacity versus the AoI when T = 10.
Figure 6. The energy consumption on sensing spectrum versus the AoI when T = 10.
Table 1. List of notations used in this paper.
| Notation | Definition |
| --- | --- |
| $p_{ai}$ | The transition probability from the active state to the idle state |
| $p_{ii}$ | The transition probability from the idle state to the idle state |
| $p_f$ | The false alarm probability |
| $p_d$ | The detection probability |
| $\phi_t$ | The sensing decision at time slot $t$ |
| $\theta_t$ | The update decision at time slot $t$ |
| $\hat{q}_t$ | The sensing result |
| $\delta$ | The energy consumption of spectrum sensing |
| $\tau_s$ | The time consumption of spectrum sensing |
| $E_{T,t}$ | The energy consumption of an update |
| $\tau_t$ | The time consumption of an update |
| $S$ | The size of the status-update data packet |
| $a_t$ | The AoI at time slot $t$ |
| $\rho_t$ | The belief probability |
| $b_{\max}$ | The maximum battery energy level |
| $g_{\max}$ | The maximum channel power gain level from the SU to the CBS |
| $h_{\max}$ | The maximum channel power gain level from the PU to the SU |
| $\Phi$ | The belief space |
| $\eta$ | The energy-harvesting efficiency |
| $\sigma^2$ | The noise power |
| $B_{\max}$ | The battery capacity of the SU |
| $A_{\max}$ | The upper bound of the AoI |
| $s_t$ | The current state |
| $x_t$ | The action at time slot $t$ |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Sun, J.; Zhang, S.; Yang, C.; Huang, L. Age of Information Minimization for Radio Frequency Energy-Harvesting Cognitive Radio Networks. Entropy 2022, 24, 596. https://doi.org/10.3390/e24050596