Maximizing Average Throughput of Cooperative Cognitive Radio Networks Based on Energy Harvesting

Energy harvesting (EH) and cooperative communication techniques have been widely used in cognitive radio networks. However, most studies on throughput in energy-harvesting cooperative cognitive radio networks (EH-CCRNs) consider only end-to-end throughput, which ignores the overall working state of the network. To address this problem, under the premise of prioritizing the communication quality of short-range users, this paper focuses on optimizing the EH-CCRN average throughput, with energy and transmission power as constraints. The formulated problem is a non-deterministic polynomial-time hard (NP-hard) problem that has not previously been solved. To make it tractable, a multi-user time-power resource allocation algorithm (M-TPRA) is proposed, which is based on sub-gradient descent and unary linear optimization. Simulation results show that the M-TPRA algorithm can improve the average throughput of the network. In addition, the energy consumed by executing the M-TPRA algorithm is analyzed.


Introduction
In recent years, with the rapid growth in the number of mobile users and wireless communication devices, the demand for spectrum resources has increased dramatically. However, studies have shown that the low utilization of the allocated spectrum exacerbates the shortage of spectrum resources [1]. The cognitive radio (CR) technology proposed by Mitola allows the secondary user (SU) to opportunistically use the spectrum of the primary user (PU) while ensuring the communication quality of the PU [2]. This approach improves spectrum utilization and alleviates the shortage of spectrum resources [3,4]. Meanwhile, EH and cooperative communication are considered key technologies for improving the throughput of various wireless networks, including the cognitive radio network (CRN).
EH technology addresses the problem that key functions in CR technology (such as spectrum sensing and spectrum prediction) increase the energy consumption of devices. Attempts have been made to reduce the overall energy consumption of the devices with energy-saving technology [5,6], but most CRNs are composed of battery-powered wireless devices, and energy-saving technology alone cannot fundamentally solve the problem of energy consumption. EH provides a new way of powering the network by extracting energy from sources including solar [7], wind [8], radio frequency (RF) [9] and other energy sources. Among them, RF energy has become especially attractive because of its reliability and sustainability. Wireless power transmission (WPT) technology derived from EH is widely used. In ref. [10], the WPT duration, the transmission time allocation of each edge device and the partial offloading decision are jointly optimized in order to maximize the sum computation rate, and an online offloading algorithm based on deep reinforcement learning (DRL) is designed to solve the optimization problem. Zheng et al. studied the WPT-aided cell-free massive MIMO system. The harvested energy (HE) from
(1) A time-power joint optimization model with the goal of maximizing the network's throughput is proposed and analyzed. The optimization model is constrained by transmission power, energy and interruption. Moreover, we comprehensively analyze the impacts of different key parameters on the average throughput of EH-CCRNs, i.e., the transmission power of PU, the system time switching factor, distance, etc.
(2) A power splitting factor expression at SU is proposed, on the basis that the effect of the time switching factor on the network average throughput is independent of this factor. We provide a detailed analysis of the influence of this factor on the communication quality of short-range users.
(3) A multi-user time-power resource allocation algorithm (M-TPRA) is proposed. Firstly, M-TPRA transforms the non-convex optimization problem into a convex optimization problem by introducing a slack variable. Secondly, using the idea of hierarchical optimization, the optimization problem is divided into two sub-problems: power control and time allocation. Thirdly, the power control is obtained by sub-gradient descent, and the time allocation is obtained by unary linear optimization. Finally, we analyze the energy consumed by implementing the M-TPRA algorithm.
The rest of this paper is organized as follows. In Section 2, we introduce the system model and problem description. In Section 3, we present a solution to the time-power optimization problem. Section 4 provides the simulation results. Finally, a summary of the work is given in Section 5.

System Model
Figure 1 illustrates an underlay EH-CCRN, which consists of a primary user transmitter (PU_1), a primary user receiver (PU_2), a secondary user transmitter (ST) and a secondary user receiver (SR). We assume that there is no direct communication link between PU_1 and PU_2 due to distance and shadow fading. As a relay, ST can collect energy from the RF signal of PU_1 and obtain some resources (time, spectrum) to decode and forward information. In addition, all channels are independent and identically distributed and subject to Rayleigh fading. The channel coefficients remain unchanged for block time T. The notation list in Appendix A summarizes the main variables and parameters used in this study.
The system transmission protocol is divided into two stages, as shown in Figure 2. In the first stage, PU_1 broadcasts, and ST uses the power splitting (PS) receiving scheme for EH and information decoding (ID). In the second stage, ST uses the harvested energy to transmit information to SR and PU_2.

In the first stage, the signal received by ST is given by Equation (1), where P_PU1 is the transmission power of PU_1, m is the path loss exponent, h_1 is the channel coefficient between PU_1 and ST, d_1 is the distance between PU_1 and ST, and n_ST is zero-mean additive white Gaussian noise at ST with variance σ_ST^2. Thus, the signal-to-noise ratio (SNR) at ST is given by Equation (2). According to Shannon's theorem, the maximum transmission rate v_1 between PU_1 and ST is given by Equation (3). Furthermore, the energy E_ST collected by ST is given by Equation (4), where α (0 < α < 1) is the time switching factor, β (0 < β < 1) is the power-splitting factor applied to the signal received from PU_1 and η (0 < η < 1) is the energy conversion efficiency.
In particular, since PU_1 transmits information by broadcasting, it causes some interference at the SR. The interference signal received by SR is given by Equation (5), where h_2 is the channel coefficient between PU_1 and SR, d_2 is the distance between PU_1 and SR, and n_SR is zero-mean additive white Gaussian noise at SR with variance σ_SR^2. We assume that the SR can successfully decode the received signal transmitted by the ST in the second stage, which can effectively eliminate the interference of PU_1 [28].
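As a rough numerical illustration of the first stage, the sketch below computes the SNR at ST, the rate v_1 and the harvested energy E_ST. It is a minimal sketch under assumptions that the text does not confirm: the information-decoding branch is taken to receive the fraction (1 − β) of the received power, v_1 is taken to be α·log2(1 + SNR), and the function name and all numeric values are hypothetical. The energy expression follows the form E_ST = αβ|h_1|^2 P_PU1 T/d_1^m used in the energy consumption analysis, with η included as a multiplier.

```python
import math

def first_stage(p_pu1, h1_gain, d1, m, alpha, beta, eta, noise_var, block_time=1.0):
    """Return (snr_st, v1, e_st) for the PU_1 -> ST broadcast stage.

    h1_gain stands for |h_1|^2. The (1 - beta) share used for decoding and
    the alpha-weighted rate are assumptions made for this illustration.
    """
    rx_power = p_pu1 * h1_gain / d1 ** m               # received power after path loss
    snr_st = (1.0 - beta) * rx_power / noise_var       # assumed share of the ID branch
    v1 = alpha * math.log2(1.0 + snr_st)               # assumed time-weighted Shannon rate
    e_st = eta * alpha * beta * rx_power * block_time  # harvested energy
    return snr_st, v1, e_st

# Hypothetical parameter values, chosen only to make the arithmetic easy to follow.
snr_st, v1, e_st = first_stage(p_pu1=4.0, h1_gain=1.0, d1=2.0, m=2,
                               alpha=0.5, beta=0.5, eta=1.0, noise_var=0.5)
print(snr_st, v1, e_st)
```

With these values the received power is 1, so the trade-off is easy to see: raising β increases the harvested energy but lowers the SNR available for decoding.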

ST Broadcasting
After ST decodes the signal transmitted by PU_1 successfully, it broadcasts the signal s_ST with the transmission power P_ST. ST divides P_ST into two parts, γP_ST and (1 − γ)P_ST, where γ (0 < γ < 1) is the power-splitting factor at ST, γP_ST is the power of the signal s_PU2 sent to PU_2, and (1 − γ)P_ST is the power of the signal s_SR sent to SR. The broadcast signal of ST is given by Equation (6).
On the one hand, the signal received by PU_2 is given by Equation (7), where h_3 is the channel coefficient between PU_2 and ST, d_3 is the distance between PU_2 and ST, and n_PU2 is zero-mean additive white Gaussian noise at PU_2 with variance σ_PU2^2. The signal-to-interference-plus-noise ratio (SINR) at PU_2 is given by Equation (8), and the maximum transmission rate v_2 on the channel between ST and PU_2 is given by Equation (9).
On the other hand, the signal received by SR after eliminating the interference is given by Equation (10). Therefore, the SINR at SR is given by Equation (11), and the maximum transmission rate v_3 on the channel between ST and SR is given by Equation (12).
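The power split at ST can be illustrated numerically. The following is a minimal sketch, not the paper's exact expressions: it assumes that each receiver treats the signal component intended for the other receiver as interference, and that both second-stage rates are weighted by the remaining time fraction (1 − α). The function name, argument names and numeric values are hypothetical.

```python
import math

def second_stage(p_st, gamma, h3_gain, d3, h4_gain, d4, m, alpha,
                 noise_pu2, noise_sr):
    """Return (v2, v3) for the ST broadcast stage under the stated assumptions.

    h3_gain and h4_gain stand for |h_3|^2 and |h_4|^2.
    """
    g3 = h3_gain / d3 ** m   # ST -> PU_2 link gain including path loss
    g4 = h4_gain / d4 ** m   # ST -> SR link gain including path loss
    # gamma * P_ST carries s_PU2 and (1 - gamma) * P_ST carries s_SR.
    sinr_pu2 = gamma * p_st * g3 / ((1.0 - gamma) * p_st * g3 + noise_pu2)
    sinr_sr = (1.0 - gamma) * p_st * g4 / (gamma * p_st * g4 + noise_sr)
    v2 = (1.0 - alpha) * math.log2(1.0 + sinr_pu2)
    v3 = (1.0 - alpha) * math.log2(1.0 + sinr_sr)
    return v2, v3

# Symmetric hypothetical setting: equal distances, gains and noise, gamma = 0.5.
v2, v3 = second_stage(p_st=2.0, gamma=0.5, h3_gain=1.0, d3=1.0,
                      h4_gain=1.0, d4=1.0, m=2, alpha=0.5,
                      noise_pu2=1.0, noise_sr=1.0)
print(v2, v3)
```

In this symmetric setting the two rates coincide, which matches the intuition that an equal split is natural when both receivers are equally far from ST.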

Problem Formulation
In EH-CCRNs, when any of v_1, v_2 and v_3 is lower than the target rate R_t, the corresponding link is interrupted. Assuming that the system is not interrupted within the time T, v_1, v_2 and v_3 must satisfy Equation (13). Assuming that the SU does not have a battery and can only use a capacitor for energy storage, the energy required by the SU in the second stage can only come from the energy collected in the first stage, so E_ST must satisfy Equation (14). According to the characteristics of the CRN, P_ST is constrained by the interference threshold I_th, as expressed in Equation (15). To prevent interference with public systems, P_PU1 is constrained by the maximum transmission power P_Tmax, as expressed in Equation (16). The average network throughput τ is given by Equation (17).
The research goal of this paper is to maximize the average throughput of the network while ensuring the quality of user communication. Therefore, the throughput optimization problem P1 is described by Equation (18). The optimization variables are the transmission power P_PU1, the time switching factor α and the power splitting factor γ at ST. C1~C8 are the constraints of the optimization problem. C7 and C8 are constraints on the value ranges of α, β and γ, making the system model more realistic.
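The constraint types described above can be collected into a single feasibility check. This is a hedged sketch: the text names the constraints but their exact expressions are not reproduced here, so the energy condition (1 − α)·T·P_ST ≤ E_ST, the interference condition P_ST·g ≤ I_th and the parameter `interference_gain` are assumptions introduced for illustration.

```python
def is_feasible(v, r_t, e_st, p_st, alpha, block_time,
                interference_gain, i_th, p_pu1, p_tmax, beta, gamma):
    """Check one operating point against the constraint types listed in the text."""
    rates_ok = all(rate >= r_t for rate in v)               # no link falls below R_t
    energy_ok = (1.0 - alpha) * block_time * p_st <= e_st   # assumed energy causality form
    interference_ok = p_st * interference_gain <= i_th      # assumed interference form
    power_ok = 0.0 < p_pu1 <= p_tmax                        # PU_1 power limit
    ranges_ok = all(0.0 < x < 1.0 for x in (alpha, beta, gamma))
    return rates_ok and energy_ok and interference_ok and power_ok and ranges_ok

# Hypothetical operating point; the same point fails once the target rate is raised.
point = dict(v=(1.2, 1.1, 1.0), e_st=1.5, p_st=2.0, alpha=0.4, block_time=1.0,
             interference_gain=0.1, i_th=0.5, p_pu1=4.0, p_tmax=5.0,
             beta=0.5, gamma=0.5)
feasible_low_target = is_feasible(r_t=1.0, **point)
feasible_high_target = is_feasible(r_t=1.05, **point)
print(feasible_low_target, feasible_high_target)
```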

Multi-User Time-Power Resource Allocation Algorithm (M-TPRA)
Equation (18) is a non-convex problem (see Appendix B for the proof), and it is difficult to obtain its solution directly. Therefore, it is necessary to transform it into a convex optimization problem.
In the model proposed in this paper, we assume that the effect of α on the network average throughput is independent of γ. That is, the derivative of the network average throughput τ with respect to α is independent of γ, which is expressed in Equation (19). From Equation (19), we can obtain Equation (20). From Equation (20), γ* can be solved and expressed as in Equation (21). According to C7 and Equation (21), the constraint in Equation (22) can be obtained. As the channel gains, including |h_4|^2, are non-negative, P_ST > 0 is always satisfied. By substituting Equations (3), (9) and (12) into Equation (18), the optimization problem can be rewritten as problem P2 in Equation (23). However, the objective function of P2 is still non-convex, and the feasible region is not a convex set. In order to obtain the global optimal solution, a slack variable ω = αP_PU1 is introduced [29], and the optimization problem in Equation (23) is rewritten as problem P3. Now, the objective function of P3 is concave, and the feasible region is a convex set (see Appendix C for the proof). However, due to the nature of the objective function and constraints, a joint closed-form solution cannot be obtained. Therefore, based on the idea of cross-layer optimization, this paper decomposes the original optimization problem into an inner layer and an outer layer.
Under the requirement of ensuring the normal communication of the system, the inner layer converts the optimization objective function into a subtractive form. Then, sub-gradient descent is used to obtain the optimal transmission power of PU_1. Once this optimal transmission power has been obtained, the outer layer obtains the optimal time switching factor α by unary linear optimization.
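The role of the slack variable ω = αP_PU1 can be illustrated on a single rate term. The block below is a sketch that assumes the term has the form α·log2(1 + c·P_PU1), where c > 0 is a hypothetical constant collecting the channel gain, path loss and noise power; it is not the full objective of P3.

```latex
\begin{aligned}
f(\alpha, P_{PU_1}) &= \alpha \log_2\!\left(1 + c\,P_{PU_1}\right)
  && \text{(product coupling, not jointly concave)} \\
\omega &= \alpha P_{PU_1}
  && \text{(slack variable)} \\
g(\alpha, \omega) &= \alpha \log_2\!\left(1 + \frac{c\,\omega}{\alpha}\right)
  && \text{(perspective of } \log_2(1 + c\,\omega)\text{)}
\end{aligned}
```

Because the perspective of a concave function is jointly concave for α > 0, g is concave in (α, ω). Its Hessian has rank one, so one eigenvalue is zero and the other is negative, which is consistent with the statement in Appendix C that the eigenvalues are h_11 and 0 and that none is positive.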

Power Control
This section describes the inner layer of M-TPRA, which solves the power control problem. The optimization variable is P_PU1. The Lagrangian of the sub-problem is formed with the dual variables λ_i (i = 1, 2, 3). Applying the Karush-Kuhn-Tucker (KKT) conditions yields Equation (27), in which [x]^+ represents max{0, x}. From Equation (27), it can be seen that P_PU1 is independent of α. The dual variables λ_i (i = 1, 2, 3) are obtained by the sub-gradient method, where k is the iteration index and ξ is the step size along the negative sub-gradient direction in each iteration. After the dual variables are updated, the value of ω is updated accordingly. The specific implementation steps are shown in Algorithm 1, which includes a check that the power satisfies 0 < P_PU1 < P_Tmax.
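The interplay between the KKT condition and the projected sub-gradient update of a dual variable can be shown on a toy problem. The sketch below does not reproduce the Lagrangian of the paper's sub-problem; it assumes a deliberately simple stand-in, maximizing log(1 + p) subject to p ≤ p_max with a single dual variable, and the function name, step size and iteration count are hypothetical.

```python
def dual_subgradient(p_max, step=0.05, iterations=200):
    """Maximize log(1 + p) subject to p <= p_max by dual sub-gradient updates."""
    lam = 1.0  # initial dual variable for the constraint p <= p_max
    p = 0.0
    for _ in range(iterations):
        # KKT stationarity 1 / (1 + p) = lam, projected onto p >= 0 ([x]^+).
        p = max(0.0, 1.0 / lam - 1.0)
        # Sub-gradient step on the dual variable, kept strictly positive.
        lam = max(1e-9, lam + step * (p - p_max))
    return p, lam

p_opt, lam_opt = dual_subgradient(p_max=3.0)
print(p_opt, lam_opt)
```

For this toy problem the constraint is active at the optimum, so the iteration settles at p = p_max with dual variable 1/(1 + p_max). The same pattern, a closed-form primal update followed by a projected dual step, is what the power control layer relies on.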

Time Allocation
This section describes the outer layer of M-TPRA, which solves the time allocation problem. That is, once the optimal transmission power of PU_1 is known, the maximum average throughput of the network is calculated. The resulting sub-problem, P5, is a unary linear optimization problem. The value range Ω_α of α can be calculated from the constraints of P5, and the maximum value of the objective function is then obtained within this range. The detailed steps of the multi-user time-power resource allocation algorithm (M-TPRA) are shown in Algorithm 1.
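Because a linear function of one variable attains its maximum at an endpoint of a closed interval, the outer layer reduces to an endpoint comparison. The sketch below assumes that Ω_α has already been reduced to an interval [lower, upper] and that the throughput is written as slope·α + intercept; the function name and the numeric values are hypothetical.

```python
def maximize_linear(slope, intercept, lower, upper):
    """Return (alpha_star, value) maximizing slope * alpha + intercept on [lower, upper]."""
    if lower > upper:
        raise ValueError("empty feasible interval")
    # A linear function attains its maximum at an endpoint of the interval.
    alpha_star = upper if slope > 0 else lower
    return alpha_star, slope * alpha_star + intercept

# Hypothetical coefficients with a negative slope in alpha.
alpha_star, tau_max = maximize_linear(slope=-2.0, intercept=5.0, lower=0.2, upper=0.8)
print(alpha_star, tau_max)
```

With a negative slope the maximizer is the lower endpoint of the interval, in line with the simulation section's observation that α should take the minimum value in Ω_α.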

Simulation Parameters
In this section, M-TPRA is simulated, verified and analyzed for a network with one pair of PUs and one pair of SUs. Some of the simulation parameter values are based on ref. [28], and the parameters are summarized in Table 1.
For horizontal comparison, Figure 3 shows the relationship between P_PU1 and the network average throughput. As P_PU1 increases, the network average throughput increases gradually, for two main reasons. Firstly, when the noise power is constant, a greater transmission power yields a greater SINR_ST and hence a greater maximum transmission rate v_1 on the communication link from PU_1 to ST, so the network average throughput increases. Secondly, the network adopts the DF relay strategy: the larger SINR_ST is, the higher the probability that ST decodes successfully, which also improves the network average throughput. However, because of external interference and the limit P_Tmax, the network average throughput grows more slowly as P_PU1 becomes larger.

For vertical comparison, Figure 3 also shows the relationship between P_ST and the network average throughput. When P_PU1 is kept constant, the network average throughput increases as P_ST increases, for reasons similar to those in the horizontal comparison. However, as P_ST increases, the curves in Figure 3 gradually become more closely spaced, indicating that the average network throughput grows slowly, which is due to the influence of the interference threshold I_th.

Relationship between α and Throughput
Figure 4 shows the relationship between α and the network average throughput, where P_ST = 2 W. The PU network contains two communication links, PU_1 → ST and ST → PU_2, and the SU network contains one communication link, ST → SR, so the PU network throughput is v_1 + v_2, the SU network throughput is v_3, and the average network throughput is v_1 + v_2 + v_3. First of all, we observe from Figure 4 that the network average throughput, the PU network throughput and the SU network throughput cannot be optimal at the same time. Secondly, when the other parameters in the network are fixed, the average throughput of the network is linear in α. As α increases, v_1 gradually becomes larger, so the PU network throughput increases linearly with α. However, the SU network throughput and the network average throughput decrease linearly with α. This is because an increase of α shortens the time left for ST to transmit information. The maximum transmission rates on the two communication links ST → PU_2 and ST → SR become smaller and may fall below the target rate R_t, which interrupts the link and reduces the throughput. In this system model, the goal is to maximize the network average throughput, so α should take the minimum value in the range Ω_α.
Figure 5 shows the relationship between the distance d_3 from ST to PU_2 and γ at ST, and Figure 6 shows the relationship between d_3 and v_3. In both figures, the distance d_4 from ST to SR remains unchanged at 1.5 m. When d_3 = d_4 = 1.5 m, it can be observed from Figure 5 that γ = 0.5; that is, ST distributes its power equally whatever the value of P_ST. In addition, when P_ST remains unchanged, γ gradually decreases as d_3 increases, so ST allocates less transmission power to PU_2. This is because ST reserves more power for the SR to ensure the communication quality of the close-range user. Therefore, from Figure 6 we can observe that the throughput v_3 of the SU network communication link ST → SR becomes larger as d_3 becomes larger. At the same time, when d_3 and d_4 are kept constant, the larger P_ST is, the smaller the proportion of transmission power allocated to PU_2; hence γ decreases as P_ST increases.

Figure 8 shows the relationship between the distance d_4 and v_3 + v_2, where the distance d_3 from ST to PU_2 remains unchanged at 1.5 m. In the system model, the transmission power allocated by ST to SR is (1 − γ)P_ST, which is complementary to the transmission power γP_ST allocated to PU_2. Therefore, in Figure 7, the d_4-γ relationship has the opposite trend to the d_3-γ relationship, and the two are symmetrical about γ = 0.5. The characteristics of the curves in Figure 7 are similar to those in Figure 5. At the same time, we can see from Figure 8 that the throughput of the PU network v_3 + v_2 becomes larger as d_4 increases, which ensures the communication quality of the close-range user.
Figure 9 shows the performance comparison of M-TPRA, the exhaustive method and the joint optimization algorithm in this model. It can be seen from Figure 9 that as P_PU1 increases, the performance of M-TPRA gradually approaches that of the exhaustive method. However, the exhaustive method needs to traverse all possibilities, so it takes longer than M-TPRA. With similar performance, M-TPRA is more applicable to practical situations. The optimization goal of the joint optimization algorithm proposed in ref. [29] is end-to-end throughput. From Figure 9, it can be observed that the performance of M-TPRA is much higher than that of the joint optimization algorithm. Therefore, an algorithm that optimizes the end-to-end throughput cannot guarantee the overall throughput of the network.
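The cost of the exhaustive baseline comes from evaluating every candidate point. The sketch below shows a generic grid search of this kind; it is not the implementation used for Figure 9. The grids, the feasibility rule and the toy objective (increasing in power, decreasing in α, mirroring the trends reported for Figures 3 and 4) are all hypothetical.

```python
import itertools

def exhaustive_search(objective, feasible, p_grid, alpha_grid):
    """Return (best_value, best_p, best_alpha) over every feasible grid point."""
    best = None
    for p, alpha in itertools.product(p_grid, alpha_grid):
        if not feasible(p, alpha):
            continue  # skip points that violate the constraints
        value = objective(p, alpha)
        if best is None or value > best[0]:
            best = (value, p, alpha)
    return best

p_grid = [0.5 * k for k in range(1, 11)]       # 0.5, 1.0, ..., 5.0
alpha_grid = [0.1 * k for k in range(1, 10)]   # 0.1, 0.2, ..., 0.9
best = exhaustive_search(
    objective=lambda p, a: p - a,                  # toy stand-in for the throughput
    feasible=lambda p, a: p <= 4.0 and a >= 0.3,   # toy stand-in for the constraints
    p_grid=p_grid,
    alpha_grid=alpha_grid,
)
print(best, len(p_grid) * len(alpha_grid))
```

The number of objective evaluations grows with the product of the grid sizes, which is why a finer grid or an additional optimization variable quickly makes this baseline slower than an iterative method.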

Figure 9. Performance comparison of M-TPRA, exhaustive method and joint optimization algorithm.

Energy Consumption Analysis
In this system, we ignore the energy loss of the EH circuit when converting RF energy into electrical energy. Therefore, the energy used by the ST consists of only two parts, one for executing the M-TPRA algorithm and the other for transmitting information. In order to balance the communication quality of SR and PU_2, the maximum value of d_3 is 9 m when d_4 is 1.5 m. In addition, considering the saturation of the EH circuit, a timer is added to the EH system. The timer specifies the time required to collect energy. Figure 10 shows the relationship between the transmit power and the EH time when the distance between users is fixed. For example, when P_ST is 20 W, the maximum energy collected by the system is 3.34 W. When the information transmission distance is 9 m, 2.64 W is required. Therefore, the energy consumed by executing the M-TPRA algorithm is about 0.7 W. Beyond this energy value, the communication link may be interrupted.
The time required to collect the energy E_ST = αβ|h_1|^2 P_PU1 T/d_1^m is αT, so the average rate of energy collection is e_ST = β|h_1|^2 P_PU1/d_1^m. Therefore, when the energy required to execute the M-TPRA algorithm is 0.7 W, the time to collect 3.34 W of energy is 1 s. It can be seen from Figure 10 that when the distance between users is fixed, the larger P_PU1 is, the shorter the time required for energy collection. When the transmit power is constant, the larger d_3 is, the longer the energy collection time.
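The arithmetic behind these figures can be checked with a few lines. This is a minimal sketch using the numbers quoted in the text and keeping the units as stated there; the function names are hypothetical, and the harvesting rate of 3.34 per second is inferred from the statement that collecting 3.34 takes 1 s.

```python
def algorithm_energy_budget(collected, transmission):
    """Energy left for running the algorithm after paying for transmission."""
    return collected - transmission

def harvesting_time(energy, rate):
    """Time needed to collect `energy` at a constant average harvesting rate."""
    return energy / rate

budget = algorithm_energy_budget(collected=3.34, transmission=2.64)
time_needed = harvesting_time(energy=3.34, rate=3.34)  # rate inferred from the text
print(round(budget, 2), time_needed)
```

The difference between the collected and the transmitted amounts gives the quoted budget of about 0.7 for executing M-TPRA. Since e_ST is proportional to P_PU1, a larger P_PU1 shortens the collection time for the same target energy.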

Conclusions
This paper studies the problem of maximizing average throughput in EH-CCRNs. In this network, the SU relies on the energy collected from the primary user's RF signal to decode and forward information, and the transmission power of both the PU and the SU is limited. Under these constraints, a multi-user time-power resource allocation algorithm (M-TPRA) is proposed to improve the network average throughput. Simulation results show that the average throughput of the network is positively correlated with the primary user's transmission power and negatively correlated with the system time switching factor. At the same time, the system can reasonably set the power splitting factor at the SU according to the user distance and give priority to ensuring the communication quality of the short-range users. In addition, the energy consumed by implementing the M-TPRA algorithm is also analyzed. In order to make the system more realistic, the saturation of the EH circuit is considered. A timer is added to the system, and the time set by the timer is related to P_PU1 and the distance between users. This work can also be extended to the scenario of multiple pairs of secondary users.

Appendix B
It can be seen from the above equation that the eigenvalues of the Hessian matrix are b_11, b_22 and b_33, which are not all negative, so the objective function is not concave and the maximization problem is not a convex optimization problem. In the same way, it can be seen that the constraint C1 is non-convex, while the constraints C2, C3, C4, C5 and C6 are convex, and the constraints C7 and C8 are affine and therefore always convex.

Appendix C
The Hessian matrix H_2 of optimization problem P3 is obtained by substituting the objective function of P3 into the definition of the Hessian matrix. The eigenvalues of H_2 are h_11 and 0, neither of which is positive, so the objective function is concave. Similarly, the Hessian matrices of the constraint functions in P3 have only non-negative eigenvalues, so the feasible region of P3 is a convex set.
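The eigenvalue test used in the appendices can be reproduced for a symmetric 2 × 2 matrix. The sketch below is illustrative only: the matrix entries are hypothetical numbers, not the Hessian entries of P1 or P3. A matrix of the form [[h_11, 0], [0, 0]] with h_11 < 0 passes the concavity test, while an indefinite matrix fails it.

```python
import math

def eigenvalues_2x2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def is_negative_semidefinite(a, b, d, tol=1e-12):
    """True if no eigenvalue of [[a, b], [b, d]] exceeds tol."""
    return eigenvalues_2x2(a, b, d)[1] <= tol

concave_case = is_negative_semidefinite(-2.0, 0.0, 0.0)    # eigenvalues -2 and 0
indefinite_case = is_negative_semidefinite(0.0, 1.0, -1.0)  # one positive eigenvalue
print(eigenvalues_2x2(-2.0, 0.0, 0.0), concave_case, indefinite_case)
```

A function whose Hessian is negative semidefinite everywhere on a convex domain is concave, so the first case corresponds to the situation described for P3 and the second to a non-concave objective.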