Joint Power Control and Resource Allocation with Rate Fairness Consideration for SWIPT-Based Cognitive Two-Way Relay Networks

This paper investigates the power control and resource allocation problem in a simultaneous wireless information and power transfer (SWIPT)-based cognitive two-way relay network, in which two secondary users exchange information through a power splitting (PS) energy harvesting (EH) cognitive relay node that operates in underlay mode within a primary network. To enhance the secondary network's transmission ability without detriment to the primary network, we formulate an optimization problem that maximizes the minimum transmission rate of the cognitive users by jointly optimizing the power allocation at the sources, the time allocation of transmission frames and the power splitting ratio at the relay, under the constraint that the transmission power of the cognitive network does not exceed the primary user interference threshold, which protects the primary network performance. To solve this problem efficiently, a sub-optimal algorithm named the joint power control and resource allocation (JPCRA) scheme is proposed, in which we decouple the non-convex problem into convex subproblems and solve them in alternating steps to obtain the final solution. Numerical results reveal that the proposed scheme enhances transmission fairness and outperforms three traditional schemes.


Introduction
The demand for spectrum resources is increasing rapidly as wireless communication continues to develop. However, the limited spectrum resources severely restrict the further growth of communication capacity. Cognitive radio is an effective technology to enhance spectrum efficiency through the reasonable reuse of the authorized spectrum. The technology mainly includes three modes: spectrum interweave, spectrum overlay and spectrum underlay [1,2]. Compared with interweave and overlay, underlay is simple to implement because no spectrum sensing is needed, and it has a better ability to realize spectrum sharing [3]. With underlay, in order to ensure a high priority of the primary user's (PU) access to the spectrum, the transmit power of the cognitive users must be kept under the interference tolerance threshold of the PU. Therefore, power control becomes a key issue in optimizing the overall performance of the cognitive network [4].
In [5], Lee et al. investigate transmit power control for an underlay cognitive radio network by using a deep learning method that determines its own transmit power based solely on its local channel state information (CSI). In [6], Sarvendranath et al. develop an optimal and novel joint antenna selection and power adaptation rule that minimizes the average symbol error probability of a secondary user that is subject to two practically well-motivated constraints. Hu et al. [7] propose two optimal power control schemes from the long-term and short-term perspectives for a cognitive low orbit satellite constellation with terrestrial networks, which aim to maximize the delay-limited capacity and minimize the outage probability, respectively. In [8], Chuang et al. propose a dynamic multiobjective approach for power and spectrum allocation in a cognitive-based environment and propose a dynamic resource allocation algorithm comprising a hybrid initialization method and feasible point generation mechanisms to solve the dynamic multiobjective optimization problem. In [9], two efficient and low-complexity power control strategies are proposed for an ambient backscatter-based spectrum-sharing network, and with the backscatter prominent, there is no need to estimate all users' CSI.
The relay technique can expand the transmission distance and improve the transmission reliability of the system. Integrating relay and cognitive techniques can further improve the transmission performance of the system [10,11]. In [12], the closed-form expression of outage probability for a cognitive multi-hop relay network is derived over Rayleigh fading channels, and an optimization problem to minimize the outage probability of the cognitive relay network is formulated and solved. In [13], a novel decentralized scheduling technique is developed for the cognitive multi-user multi-relay network, which operates on an incremental relaying mechanism, and the outage probability of the secondary network is derived for both the decode-and-forward (DF) and amplify-and-forward (AF) strategies. The transmission performance of a two-way AF cognitive relay network considering the influence of the primary network is studied in [14], and closed forms of outage probability and bit error rate are derived. In [15], Yang et al. propose a dynamic power transmission scheme for non-orthogonal multiple access (NOMA) cognitive relay networks and derive the closed-form expressions of outage probability and average sum rate. Zhong and Zhang investigate relay selection in a two-way full-duplex AF relay network [16] and derive the system outage probability and bit error rate. In [17], Poornima et al. investigate the energy efficiency and the spectral efficiency performance of multi-hop full-duplex cognitive relay networks.
Relaying often causes energy consumption issues, and forcing idle users to use their own energy to help the relay is difficult. Therefore, the energy issue in cognitive relay networks has become a topic of substantial research interest [18]. Introducing radio frequency (RF) wireless energy harvesting (EH) technology into cognitive relay networks could potentially solve both the energy and spectrum problems, which has attracted significant attention in the academic community [19]. In [20], He et al. derive and compare the outage probabilities of the primary network and the EH cognitive network under direct transmission, single-user cooperation and multi-user cooperation scenarios. In [21], Shome et al. investigate the error probability of an energy harvesting cooperative cognitive radio network with several relay selection criteria. Wang et al. [22] study an energy harvesting-based secure transmission scheme for cognitive multi-relay networks and analyze the average secrecy rate, the secondary secrecy outage probability and the ergodic secrecy rate. More recently, some researchers have begun to study the SWIPT protocols and resource allocation for cognitive AF two-way relay networks [23,24]. The optimization model of [23] aims to maximize the total transmission throughput of the system, and the authors propose an algorithm by optimizing the transmit power of sensor nodes. In [24], the approximate closed-form expression of minimizing the outage probability and throughput is taken as the optimization objective, and the closed-form solutions of the optimal power control parameters and power partition ratio are obtained. Shukla et al. [25] evaluate the performance of the proposed SWIPT-enabled NOMA system by considering both perfect and imperfect successive interference cancellation for the legitimate users over Nakagami-m fading in terms of outage probability, system throughput and energy efficiency.
However, to the best of our knowledge, most previous works on performance optimization for energy harvesting cognitive two-way relay networks focus on the AF transmission protocol. Moreover, the fairness issue and the interference effects of the primary network on the secondary network are seldom studied. In our previous study [26], we investigated the power allocation under power control for a simultaneous wireless information and power transfer (SWIPT)-based cognitive two-way relay network with rate fairness consideration. However, the time allocation and power splitting issues were not considered. In this paper, we consider the two-way DF cognitive relay network and investigate the jointly optimal design based on the PS energy harvesting protocol, which aims to maximize the minimum cognitive user transmission rate with rate fairness and power control consideration. We achieve this by jointly optimizing the power allocation at the source nodes, the time allocation of frames and the power splitting ratio at the relay. The main contributions are summarized as follows.
(1) We develop a joint optimization scheme to maximize the minimum cognitive user transmission rate under rate fairness and power control consideration. The goal is to maximize the minimum cognitive user transmission performance through the joint optimization of time, power and power splitting ratio.
(2) A stepped alternating optimization algorithm is proposed to solve the complex non-convex optimization problem. Through decoupling, the original problem is transformed into convex optimization subproblems that are solved alternately. This avoids solving the complex non-convex optimization problem directly.
(3) The results show that the proposed scheme alleviates the unfairness of inter-user transmission caused by channel asymmetry, and its superiority over the traditional schemes in terms of the max-min achievable rate is demonstrated.
The rest of this paper is organized as follows: Section 2 presents the system model and problem formulation. Section 3 proposes the joint power control and resource allocation scheme. Section 4 studies and compares the performance of the proposed scheme under a simulation system setup. Finally, Section 5 concludes the paper.

System Model and Problem Formulation
Consider a half-duplex two-way cognitive relay network that consists of two source nodes $S_1$ and $S_2$ with a fixed power supply and a passive relay node with energy harvesting ability. All the terminals are equipped with a single omnidirectional antenna, and the antenna gain is normalized to 1. It is assumed that a direct link between the source nodes does not exist. We adopt the power splitting receiver architecture and the DF protocol at the relay. The system model is shown in Figure 1. The information exchange of the whole transmission needs two time slots: a multiple access (MA) phase and a broadcast (BC) phase. During the MA phase, source nodes $S_1$ and $S_2$ transmit their own information to relay $R$. Due to the broadcast nature of the wireless channel, the PU receives the signals from source nodes $S_1$ and $S_2$ as interference. To ensure the performance of the primary network, an interference threshold is set to restrict the total transmission power of source nodes $S_1$ and $S_2$. Once the relay receives the signal, it partitions it into two parts: one part for energy harvesting, the other for information decoding. In the BC phase, the relay forwards the decoded signals to source nodes $S_1$ and $S_2$ with the harvested energy. Similarly, to ensure the performance of the primary network, the transmission power of the relay should be under the interference threshold.

Information and Energy Transfer
Let the total duration of the whole transmission be normalized to 1; if the MA phase lasts $t$, then the BC phase lasts $1 - t$. In the MA phase, the transmit power of source node $S_i$ is $P_i$. Due to the peak power constraint, $P_i$ should satisfy

$$0 \le P_i \le P_{i,\max}, \quad i = 1, 2, \tag{1}$$

where $P_{i,\max}$ is the peak power of source node $S_i$. Since the cognitive network utilizes underlay spectrum sharing, the interference power received at the primary user should not exceed an interference threshold $Q$ to protect the primary performance, i.e.,

$$\sum_{i=1}^{2} |l_i|^2 P_i \le Q, \tag{2}$$

where $l_i$ is the CSI from source $S_i$ to the PU, and $|l_i|^2 P_i$, $i = 1, 2$, is the interference caused by spectrum sharing from source node $S_i$ to the PU. Let $\alpha$ ($0 < \alpha < 1$) be the interference constraint ratio of the two cognitive sources to the primary network. The restrictions on $P_1$ and $P_2$ can be reformulated as follows.

$$|l_1|^2 P_1 \le \alpha Q, \tag{3}$$

$$|l_2|^2 P_2 \le (1 - \alpha) Q. \tag{4}$$
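As a rough illustration of how the peak power limit and the interference share interact, the sketch below computes the largest source powers allowed for a given $\alpha$. It assumes that $S_1$ receives the share $\alpha Q$ and $S_2$ the share $(1 - \alpha) Q$ of the interference budget; the function name `max_source_powers` and the numeric values are hypothetical and are not taken from the paper.

```python
def max_source_powers(alpha, q, l1, l2, p1_max, p2_max):
    """Largest (P1, P2) meeting the peak and interference limits (assumed form).

    alpha : interference share given to S1, with 0 < alpha < 1
    q     : interference threshold Q at the primary user, in watts
    l1,l2 : channel coefficients from S1 and S2 to the primary user
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Each source is capped by its own peak power and by its share of Q.
    p1 = min(p1_max, alpha * q / abs(l1) ** 2)
    p2 = min(p2_max, (1.0 - alpha) * q / abs(l2) ** 2)
    return p1, p2


if __name__ == "__main__":
    p1, p2 = max_source_powers(0.5, 1e-2, 0.2, 0.2, 1.0, 0.05)
    print(p1, p2)  # S1 is limited by its interference share, S2 by its peak power
```

With these example numbers the first source is limited by the interference share and the second by its peak power, which is the same saturation effect that the numerical results later attribute to $Q$ and $P_{i,\max}$.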
The signal received at the cognitive relay is
where $\tau_{u,r} \sim \mathcal{CN}(0, \sigma_{u,r}^2)$ is the interference introduced by the primary network, $h_i$ is the CSI between source node $S_i$ and the relay $R$, and $n_{r,a} \sim \mathcal{CN}(0, \sigma_a^2)$ is the white noise at the receiver.
The cognitive relay then splits the received signal $y_R$ into two parts: $\sqrt{\rho}\, y_R$ for energy harvesting and $\sqrt{1 - \rho}\, y_R$ for information decoding. With linear energy harvesting, the harvested energy can be expressed as
where $0 < \eta < 1$ is the energy conversion efficiency. Because in practical cases the noise power is far less than the signal power, we neglect white noise for simplicity of analysis. Thus, Equation (6) is written as
The signal used to decode information is written as
where $n_{r,b} \sim \mathcal{CN}(0, \sigma_b^2)$ is the noise generated by signal conversion from band-pass to baseband [27]. Since $\sigma_a^2 \ll \sigma_b^2$ in practice, we neglect $n_{r,a}$ in the following analysis for simplicity.
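The power splitting step can be illustrated with a small numerical sketch. It assumes the usual linear model: an amplitude split of $\sqrt{\rho}$ corresponds to a power fraction $\rho$ for harvesting, and the energy harvested during the MA phase is $\eta \rho t$ times the received power. The helper names and numbers below are hypothetical, and which terms make up the received power (source signals only, or also primary interference) is left to the caller.

```python
def split_power(p_rx, rho):
    """Split received power into (harvesting, decoding) parts for ratio rho."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return rho * p_rx, (1.0 - rho) * p_rx


def harvested_energy(p_rx, rho, t, eta):
    """Energy harvested over an MA phase of length t (linear EH model assumed)."""
    p_eh, _ = split_power(p_rx, rho)
    return eta * p_eh * t


if __name__ == "__main__":
    p_eh, p_id = split_power(1.0, 0.5)
    print(p_eh, p_id, harvested_energy(1.0, 0.5, 0.5, 0.8))
```

A larger $\rho$ increases the energy available for the BC phase but leaves less signal power for decoding in the MA phase, which is the trade-off that the joint optimization has to balance.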
According to Equation (8) and [10], the rate region of the MA phase is obtained as
where $C(x) = \log_2(1 + x)$, $\Upsilon_{ir}$, $i = 1, 2$, is the signal to interference plus noise power ratio (SINR) from source $S_i$ to relay $R$, $\Upsilon_{MA}$ is the SINR of multiple access transmission, and
where
In the broadcast phase, the cognitive relay forwards the decoded information utilizing the harvested energy. Assuming perfect CSI, the relay can utilize physical layer network coding to encode the received signal $y_{ID}$ into $x_R = x_1 \oplus x_2$. Because of underlay spectrum sharing, the transmit power of relay $R$ is restricted not only by the harvested energy but also by the interference threshold of the PU, which is written as
Equation (13) can be rewritten in a simpler form as
The signal received at source node $S_i$ is
where $\tau_{u,i} \sim \mathcal{CN}(0, \sigma_{u,i}^2)$ is the interference caused by the primary network, and $n_i \sim \mathcal{CN}(0, \sigma^2)$ is white noise at source node $S_i$.
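The double restriction on the relay transmit power can be sketched numerically. The snippet below assumes that the harvested energy is spent evenly over the BC phase of length $1 - t$ and that the relay-to-PU channel is $l_r$, so the relay power is the smaller of the energy-limited and interference-limited values. The function name and the numbers are hypothetical illustrations.

```python
def relay_power(energy, t, q, l_r):
    """Relay transmit power under energy and interference limits (assumed form).

    energy : energy harvested in the MA phase
    t      : MA phase duration, so the BC phase lasts 1 - t
    q      : interference threshold Q at the primary user, in watts
    l_r    : channel coefficient from the relay to the primary user
    """
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    p_energy = energy / (1.0 - t)    # what the harvested energy can sustain
    p_interf = q / abs(l_r) ** 2     # what the primary user can tolerate
    return min(p_energy, p_interf)


if __name__ == "__main__":
    print(relay_power(0.2, 0.5, 0.01, 0.5))    # interference-limited case
    print(relay_power(0.001, 0.5, 0.01, 0.5))  # energy-limited case
```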
Source node $S_i$ decodes $x_R$ and then uses self-cancellation to decode the intended information. For example, source node $S_1$ decodes $x_2$:
According to Equation (16) and [10], the rate region of the BC phase is obtained as
where $\Upsilon_{r1}$ and $\Upsilon_{r2}$ are the SINRs from the relay to $S_1$ and $S_2$, respectively, and are written as

Max-Min Optimization Problem Formulation
The goal is to assess the system's potential transmission capability with fairness consideration for the cognitive SWIPT-based relay network. The primary network's transmission should be guaranteed first. To this end, we propose joint power control and resource allocation optimization, aiming at maximizing the minimum transmission rate of the cognitive relay system. The optimization problem is formulated as
where $\mathbf{P} = \{P_1, P_2, P_R\}$; C1 and C2 are, respectively, the transmission power limits of the source nodes and the relay; C3 and C4 are the transmission rate region limits of the MA phase and the BC phase, respectively; and C5, C6 and C7 are the range of the time allocation parameter, the range of the power splitting parameter and the range of the power control parameter, respectively. Since multiple variables are coupled in conditions C2 to C4, OP1 is non-convex. Analysis reveals that when $P_1$ and $P_2$ are fixed, OP1 degenerates to a joint optimization problem determined by $t$ and $\rho$; when $t$ and $\rho$ are fixed, OP1 degenerates to an optimization problem determined by $\alpha$. Thus, the original problem can be decoupled into two parts: deriving the optimal power control and power allocation parameters when $t$ and $\rho$ are fixed, and deriving the time allocation and power splitting parameters with joint resource allocation when the transmission power is fixed. Based on this analysis, we develop a sub-optimal algorithm to solve this complex problem.
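To make the max-min objective concrete, the sketch below evaluates $R_{\min}$ from given SINR values. It assumes the bound structure used later in the analysis, where the MA phase contributes $t \cdot \min(C(\Upsilon_{ir}), C(\Upsilon_{MA})/2)$ and the BC phase contributes $(1 - t) \cdot C(\Upsilon_{ri})$; the function names and the example SINR values are hypothetical.

```python
import math


def capacity(x):
    """C(x) = log2(1 + x), as defined for the rate regions."""
    return math.log2(1.0 + x)


def max_min_rate(t, sinr_1r, sinr_2r, sinr_ma, sinr_r1, sinr_r2):
    """Minimum user rate implied by the MA and BC rate regions (assumed form)."""
    # MA phase: each per-link rate and half of the sum rate bound R_min.
    ma_bound = t * min(capacity(sinr_1r), capacity(sinr_2r),
                       capacity(sinr_ma) / 2.0)
    # BC phase: the weaker relay-to-source link bounds R_min.
    bc_bound = (1.0 - t) * min(capacity(sinr_r1), capacity(sinr_r2))
    return min(ma_bound, bc_bound)


if __name__ == "__main__":
    print(max_min_rate(0.5, 3.0, 3.0, 15.0, 3.0, 3.0))
```

Because the weakest link determines the result, improving an already strong link leaves $R_{\min}$ unchanged, which is why the design targets the worst user rather than the sum rate.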

Joint Power Control and Resource Allocation
A sub-optimal algorithm to solve OP1 is proposed, which is named joint power control and resource allocation (JPCRA). It is based on solving two degenerated optimization subproblems: power allocation with power control (PAPC) consideration, and jointly optimal time allocation and power splitting ratio (JoTAPS) with a fixed transmit power. The final results are obtained by alternating between PAPC and JoTAPS. This technique is described in detail next.

Power Allocation with Power Control Consideration
The first degenerated subproblem is a power allocation scheme that considers power control, i.e., deriving the optimal power control and power allocation parameters when $t$ and $\rho$ are fixed. Equations (9) and (17) show that as $P_1$ or $P_2$ increases, the achievable upper bound of the objective function $\min(R_1, R_2)$ in the MA phase increases monotonically. As $P_R$ increases, $\min(R_1, R_2)$ in the BC phase increases monotonically. Thus, the system achieves optimal transmission performance with the maximum attainable values of $P_1$, $P_2$ and $P_R$ expressed as
Equations (21)-(23) show that $P_1$, $P_2$ and $P_R$ depend on $\alpha$ ($0 < \alpha < 1$). When the sources have no transmit power limits, $P_1$, $P_2$ and $P_R$ can be rewritten as
By substituting Equations (24)-(26) into OP1, the optimization problem reduces to a one-dimensional optimization problem. Define $R_{\min} = \min(R_1, R_2)$. This one-dimensional optimization problem can be expressed as
where
and
Once the upper bounds of C3 and C4 as well as the intersection of C3 and C4 are determined, the optimal value of OP2 can be obtained by $R_{\min} = \min(R_1, R_2)$. Thus, the first step is to find the $\alpha$ that maximizes $t \cdot \min(C(\Upsilon_{ir}), C(\Upsilon_{MA})/2)$ and $(1 - t) \cdot C(\Upsilon_{ri})$. Through some mathematical analysis, OP2 can be solved in the following two cases by comparing the two terms of $P_R$ expressed in Equation (32).
(1) Case 1: the upper bound of C4 is a constant that is not affected by $\alpha$; the upper bound of C3 is a continuous piecewise function of $\alpha$. The element $C(\Upsilon_{1r})$ in C3 is a monotonically increasing function of $\alpha$, and the element $C(\Upsilon_{2r})$ in C3 is a monotonically decreasing function of $\alpha$. $C(\Upsilon_{MA})$ is a monotonic function of $\alpha$ whose monotonicity depends on the value of $H_1 - H_2$. Denote $\alpha_{ir,ma}$ as the intersection of $C(\Upsilon_{ir})$ and $C(\Upsilon_{MA})/2$, and $\alpha_{1r,2r}$ as the intersection of $C(\Upsilon_{1r})$ and $C(\Upsilon_{2r})$. The obtained $\alpha$ that maximizes
Since the upper bound of C4 is not affected by $\alpha$, $\alpha_1$ is a feasible solution for maximizing the upper bound of C4 and hence $R_{\min}$. Therefore, the optimal power control parameter of OP2 in Case 1 is
where
(2) Case 2: In this case, the upper bound of C4 is a monotonically increasing function of $\alpha$, which is affected by the value of $H_1 - H_2$; the upper bound of C3 is a continuous piecewise function of $\alpha$, in which $C(\Upsilon_{1r})$ is a monotonically increasing function of $\alpha$ while $C(\Upsilon_{2r})$ is a monotonically decreasing function, and $C(\Upsilon_{MA})$ is a monotonic function of $\alpha$ affected by the value of $H_1 - H_2$. Denote $\alpha_{ir,ma}$ as the intersection of $C(\Upsilon_{ir})$ and $C(\Upsilon_{MA})/2$, $\alpha_{1r,2r}$ as the intersection of $C(\Upsilon_{1r})$ and $C(\Upsilon_{2r})$, and $\alpha_{ir,rj}$, $i, j = 1, 2$, as the intersection of $C(\Upsilon_{ir})$ and $C(\Upsilon_{rj})$. Some analysis leads to the following observations:
In this case, $C(\Upsilon_{MA})$ and $C(\Upsilon_{ri})$ are constants. Thus, OP3 has one and only one optimal $\alpha$, $\alpha_1 = \min(\alpha_0, \alpha_{1r,2r})$.
The above analysis leads to the optimal power control parameter, which satisfies OP2 under Case 2 as
Combining the optimal power control parameters of Case 1 and Case 2, we have the following optimal power control solution:
Next, we assume fixed $t$ and $\rho$ to derive the optimal power allocation ratio using the derived optimal power control parameters and the peak power limit of the source nodes. The result is stated in Theorem 1.
Theorem 1. For fixed parameters $t$ and $\rho$, the optimal power allocation ratio that satisfies OP1 is obtained as
where $\alpha^*$ is the optimal power control parameter that maximizes the minimum cognitive user transmission rate without considering the source peak power limits, A =
Proof. When a limit on the source's peak power is not enforced, the source transmit power can be written as
with optimum value $\alpha^*$. With a peak power constraint, $P_1$ and $P_2$ are separated into four cases based on the value of $\alpha^*$ according to Equations (21)-(22):
Thus, if $\alpha^* \in \{B_1, B_2, B_3, B_4\}$, then the power control for such cases is guaranteed. Further, by substituting the obtained $P_1$ and $P_2$ into Equation (23), $P_R$ can be derived as shown in Equation (45).
The power allocation algorithm with fixed t and ρ is given in Algorithm 1.

Optimal JoTAPS Scheme
This subsection derives the jointly optimal time allocation and power splitting ratio (JoTAPS) with a fixed transmit power. Firstly, we give an initial power control parameter α = α. Then, the source nodes can determine the transmit power according to Equations (21) and (22).
By combining Equations (15) and (17), the rate region of the BC phase and the relay transmit power limit can be rewritten as
where
The constraints of rate regions C3 and C4 can be rewritten as
where
Substituting the rewritten constraints in Equations (48), (50) and (51) into OP3, we have
Some analysis reveals that with respect to $\rho$ (fixed $t$) or with respect to $t$ (fixed $\rho$), OP4 is a convex optimization problem.
Theorem 2. Given α = α, OP4 is a convex optimization problem with respect to $\rho$ (fixed $t$) or with respect to $t$ (fixed $\rho$).
Proof. To prove that OP4 is a convex optimization problem, we need to prove that the objective function is concave (it is maximized) and that the constraints define a convex feasible set. The objective function of OP4 is linear, and when $t$ and $\rho$ are fixed, C2 is a linear function. Then, we derive the concavity properties of C3 and C4.
When $t$ is fixed, the first and second derivatives of $C(\Upsilon_a)$, $a = \{1r, 2r, MA\}$, in $g_1(t, \rho)$ with respect to $\rho$ are
From the properties of concave functions, we know that if $f(x)$ and $g(x)$ are concave, then $\min(f(x), g(x))$ is concave as well. Clearly, $g_1(t, \rho)$ is concave with respect to $\rho$ when $t$ is fixed; thus, C3 is concave with respect to $\rho$. When $t$ is fixed, the first and second derivatives of
where
< 0. Thus, we can conclude that $g_2(t, \rho)$ is concave with respect to $\rho$, and so is C4 with respect to $\rho$. Now, we analyze the case when $\rho$ is fixed. In this case, $t \cdot g_1(t, \rho)$ is a linear function of $t$. The second derivative of (1
Thus, OP4 is a convex optimization problem with respect to $\rho$ (when fixing $t$) or with respect to $t$ (when fixing $\rho$). Theorem 2 is proved.
Based on Theorem 2, we propose an alternating iterative optimization algorithm to determine the optimal values of $t$ and $\rho$. The first step of the algorithm is to solve for the optimal $\rho$ with a given $t$. Let $k$ be the iteration number. We can calculate the optimal value $\rho^{(k+1)}$ of the $(k+1)$th iteration by solving OP5-a, which is written as
Then, by substituting the optimal value $\rho^{(k+1)}$ into OP5, the optimal value $t^{(k+1)}$ of the $(k+1)$th iteration can be calculated by solving OP5-b, which is written as
OP5-b: $\max R_{\min,b}$ s.t. C2:
Since OP5-a and OP5-b are convex optimization problems, the values of $\rho^{(k+1)}$ and $t^{(k+1)}$ in each iteration can be obtained by using the CVX toolbox. With the solved $\rho^{(k+1)}$ and $t^{(k+1)}$, an alternating algorithm named jointly optimal time allocation and power splitting (JoTAPS) is designed as shown in Algorithm 2, where $\varepsilon$ is the given allowable deviation.
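The alternating structure described above can be sketched in a few lines. This is only an illustration under stated assumptions: it replaces the CVX-based subproblem solvers with a one-dimensional golden-section search, and it uses a hypothetical toy rate model (`toy_rate`, with made-up constants `a` and `b`) that is concave in $\rho$ for fixed $t$ and in $t$ for fixed $\rho$, rather than the paper's actual rate regions.

```python
import math


def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # the maximizer lies in [a, d]
        else:
            a = c  # the maximizer lies in [c, b]
    return (a + b) / 2.0


def toy_rate(t, rho, a=50.0, b=20.0):
    """Hypothetical max-min rate: MA bound shrinks with rho, BC bound grows."""
    ma = t * math.log2(1.0 + a * (1.0 - rho))
    bc = (1.0 - t) * math.log2(1.0 + b * rho * t / (1.0 - t))
    return min(ma, bc)


def alternate(t0=0.5, rho0=0.5, eps=1e-6, max_iter=100):
    """Alternate between a rho-update (fixed t) and a t-update (fixed rho)."""
    t, rho = t0, rho0
    lo, hi = 1e-3, 1.0 - 1e-3
    previous = toy_rate(t, rho)
    for _ in range(max_iter):
        rho_new = golden_max(lambda r: toy_rate(t, r), lo, hi)
        if toy_rate(t, rho_new) > toy_rate(t, rho):
            rho = rho_new  # accept only improving updates
        t_new = golden_max(lambda s: toy_rate(s, rho), lo, hi)
        if toy_rate(t_new, rho) > toy_rate(t, rho):
            t = t_new
        current = toy_rate(t, rho)
        if current - previous < eps:  # stop once the gain falls below eps
            break
        previous = current
    return t, rho, toy_rate(t, rho)


if __name__ == "__main__":
    print(alternate())
```

Accepting only improving updates keeps the objective non-decreasing across iterations. Coordinate-wise updates of a non-smooth max-min objective can stall before reaching the joint optimum, so this sketch illustrates the iteration pattern and stopping rule, not the quality of the solution.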

Joint Power Control and Resource Allocation
Based on Algorithms 1 and 2, a stepped alternating optimization algorithm to solve OP1 is proposed. The idea of the algorithm is as follows:
Step 1: Give an initial power allocation ratio.
Step 2: With the transmit power fixed by this ratio, run Algorithm 2 to obtain $\rho^{(k+1)}$ and $t^{(k+1)}$.
Step 3: Substitute the obtained $\rho^{(k+1)}$ and $t^{(k+1)}$ as the initial values of Algorithm 1 to obtain the power allocation ratio.
Finally, the optimized parameter values of OP1 are obtained by repeating the above three steps iteratively. The proposed algorithm is described below.

Numerical Results and Discussions
In this section, we provide simulation results to evaluate the proposed joint power control and resource allocation (JPCRA) scheme. Based on the above derivation, we notice that system parameters such as the interference threshold, the peak power and the PU interference power have an effect on the max-min achievable rate $R_{\min}$. Thus, in the simulation part, we first simulate and analyze the effects of these parameters on $R_{\min}$ in Figures 2-4. To better demonstrate the superiority of the proposed JPCRA scheme, we compare it with the end-to-end achievable rates in Figure 5 and with three traditional optimization schemes in Figures 6-8. The three traditional optimization schemes are the joint optimal power splitting and power allocation with fixed time (FT-JoPSPA) scheme, the joint optimal time and power allocation with fixed power splitting (FPS-JoTPA) scheme and the joint optimal time and power splitting with fixed power allocation (FPA-JoTPS) scheme.
Figure 5. The max-min achievable rate vs. the end-to-end achievable rates aiming at maximizing the achievable sum rate.

Simulation Setup and Parameters
The effectiveness of our proposed scheme is evaluated through simulations in which a full-band urban indoor communication environment is taken into account. Let $d_i$ be the distance between source $S_i$ and relay $R$, let $d_{iu}$ be the distance between source $S_i$ and the primary user PU, and let $d_{ru}$ be the distance between the relay node and the PU. We consider $h_i = d_i^{-m}$, $l_i = d_{iu}^{-m}$ and $l_r = d_{ru}^{-m}$ as the channel gains between source $S_i$ and relay $R$, between source $S_i$ and the PU and between the relay node and the PU, respectively, where $m = 2$ is the path loss exponent. The channel gains between two nodes are reciprocal. To simplify the analysis, we consider a case where the sources and the relay are on a straight line, with $d_1 + d_2 = 10$ m and $d_{1u} = d_{2u} = 5$ m. The interference from the PU to each secondary user is assumed to be the same, i.e., $\sigma_{u,r}^2 = \sigma_{u,i}^2 = \sigma_u^2$. The noise power is given as $\sigma^2 = 10^{-9}$ W. Since full-band communication is considered, the frequency band is normalized as $B = 1$. The parameters used in the simulation part are listed in Table 1.

Parameter: description: value
$d_{iu}$: the distance between source $S_i$ and primary user PU: 5 m
$m$: the path loss exponent: 2
$\sigma^2$: the white noise power at $R$: $10^{-9}$ W
$\sigma_u^2$: the interference caused by the primary network: $(10^{-10}, 1)$ W
$P_{i,\max}$: the peak power of source node $S_i$: (0, 30) dBm
$Q$: the interference threshold: (−10, 30) dBm
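For readers who want to reproduce the setup numerically, the following sketch converts the dBm quantities to watts and evaluates the distance-based channel gains $d^{-m}$ with $m = 2$. The helper names are hypothetical; only the formulas and the example distances (3 m, 7 m and 5 m) come from the setup described here and in the figure parameters.

```python
def dbm_to_watt(p_dbm):
    """Convert a power level from dBm to watts."""
    return 10.0 ** ((p_dbm - 30.0) / 10.0)


def channel_gain(distance_m, exponent=2.0):
    """Distance-based channel gain d^(-m) used in the simulation setup."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return distance_m ** (-exponent)


if __name__ == "__main__":
    h1, h2 = channel_gain(3.0), channel_gain(7.0)  # (d1, d2) = (3, 7) m
    l1 = l2 = channel_gain(5.0)                    # d_1u = d_2u = 5 m
    q_watt = dbm_to_watt(5.0)                      # Q = 5 dBm
    p1_max = dbm_to_watt(20.0)                     # P_1,max = 20 dBm
    print(h1, h2, l1, l2, q_watt, p1_max)
```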

The Effects of Parameters on R min
The efficiency of the parameter settings in the proposed technique is investigated according to the following figures.
Figure 2 depicts the influence of the interference tolerance threshold on the max-min cognitive user achievable rate. The simulation parameters are $(d_1, d_2) = (3, 7)$ m, $P_{1,\max} = (20, 15)$ dBm, $P_{2,\max} = 15$ dBm and $\sigma_u^2 = \{10^{-6}, 10^{-4}\}$ W. It is observed that as $Q$ increases, $R_{\min}$ initially increases and eventually saturates, because the transmission rate of the cognitive user is a monotonically increasing function of the transmit power $P_i$, which is affected by both $Q$ and the peak power constraint. When $Q$ is sufficiently small, the transmit power is mainly limited by $Q$. When $Q$ increases beyond a certain value (larger than the peak power), the transmit power is determined mainly by the peak power. Thus, increasing $Q$ cannot continuously improve $R_{\min}$.
Figure 3 depicts the max-min achievable rate versus the peak power of source $S_1$ when $(d_1, d_2) = (3, 7)$ m, $Q = (5, 10)$ dBm, $P_{2,\max} = 15$ dBm and $\sigma_u^2 = \{10^{-6}, 10^{-4}\}$ W. The result shows that $R_{\min}$ increases and then saturates at a stable value as $P_{1,\max}$ increases. This is because $R_{\min}$ increases with the user's transmission power, and the cognitive user transmission power is affected by $Q$ and $P_{i,\max}$, $i = 1, 2$, at the same time. When $P_{1,\max}$ is small, the user's transmit power is limited by $P_{1,\max}$. Thus, $R_{\min}$ increases with $P_{1,\max}$. As $P_{1,\max}$ increases beyond a certain value (larger than $Q$), however, the user's transmit power is limited by $Q$. Thus, continuing to increase $P_{1,\max}$ will not change $R_{\min}$.
Figure 4 depicts the max-min achievable rate obtained by the proposed JPCRA scheme versus the interference power $\sigma_u^2$ caused by the primary user to the cognitive users when $(d_1, d_2) = (3, 7)$ m, $Q = (5, 10)$ dBm, $P_{1,\max} = (20, 15)$ dBm and $P_{2,\max} = 15$ dBm. It can be seen that $R_{\min}$ decreases and approaches 0 as $\sigma_u^2$ increases, and that the decline under JPCRA is gradual. Thus, the interference from the primary network must be taken into account when designing the cognitive energy harvesting system.

Performance Analysis with Comparative Schemes
The superiority of the proposed scheme over benchmark schemes is investigated according to the following figures.
Figure 5 compares the max-min achievable rate ($R_{\min}$) obtained by the proposed JPCRA scheme with the end-to-end rates obtained by the maximization of achievable sum-rate (MASR) scheme [28], where $R_i$, $i = 1, 2$, denotes the transmission rate of source $S_i$. In the simulation, the parameters are set as $d_1 = \{2, 3\}$ m, $d_{ru} = \{2, 3\}$ m, $(P_{1,\max}, P_{2,\max}) = (15, 15)$ dBm and $\sigma_u^2 = \{10^{-6}, 10^{-4}\}$ W. This figure shows that $R_2$ increases as $d_1$ decreases while $R_1$ remains the same, and $R_{\min}$ also increases, because the MASR scheme enhances the transmission ability of the link with high channel quality to maximize the achievable sum rate. Thus, the MASR scheme cannot enhance the transmission ability of the poor channel, which may mean that links with poor channel quality do not meet their transmission needs. The proposed scheme considers transmission fairness and effectively alleviates the effects caused by channel asymmetry. Therefore, when the system gives priority to improving the worst user's performance, it is better to choose the proposed JPCRA scheme, whereas when the system gives priority to the achievable sum rate, it is better to choose the MASR scheme.
Figure 6 depicts the max-min achievable rate of the proposed JPCRA scheme versus the interference threshold of the primary network, compared with three traditional transmission schemes. The parameters are set as $(d_1, d_2) = (3, 7)$ m, $P_{1,\max} = 20$ dBm, $P_{2,\max} = 15$ dBm and $\sigma_u^2 = 10^{-6}$ W. It is observed that as the interference threshold increases, the $R_{\min}$ of each scheme increases and saturates at a stable value. This is because the transmit power of the cognitive user is mainly constrained by power control when the interference threshold is small; as the interference threshold increases beyond a certain value, the transmit power of the cognitive user is mainly constrained by the peak power limits. In this figure, the peak power is a fixed parameter; thus, the performance saturates as the interference threshold increases beyond a certain value. The proposed JPCRA scheme outperforms the other three schemes over the whole range of $Q$. When the interference threshold is high, it is possible to use the FPA-JoTPS scheme in place of JPCRA, and when the interference threshold is low, it is possible to use the FT-JoPSPA scheme in place of JPCRA.
Figure 7 shows the max-min achievable rate versus the peak transmit power limit $P_{1,\max}$. The proposed JPCRA scheme is compared with three traditional transmission schemes. The parameters are set as $(d_1, d_2) = (3, 7)$ m, $Q = 5$ dBm, $P_{2,\max} = 15$ dBm and $\sigma_u^2 = 10^{-6}$ W. The proposed JPCRA scheme clearly outperforms the three other schemes. It is also observed that as $P_{1,\max}$ increases, the $R_{\min}$ of each scheme increases and saturates at a stable value. This is because the transmit power is capped by the PU threshold, which in turn caps the transmission rates. In this figure, it can be seen that when the peak transmit power limit $P_{1,\max}$ is sufficiently large, the proposed JPCRA scheme is the better choice, whereas when $P_{1,\max}$ is in the low range, it is possible to use the three other traditional schemes in place of JPCRA.
Figure 8 illustrates the max-min achievable rate versus the interference power $\sigma_u^2$ caused by the primary user to the cognitive users. The parameters are set as $(d_1, d_2) = (3, 7)$ m, $Q = 5$ dBm, $P_{1,\max} = 20$ dBm and $P_{2,\max} = 15$ dBm. As $\sigma_u^2$ increases, $R_{\min}$ decreases and approaches 0. The proposed JPCRA scheme outperforms the other three schemes, and the smaller $\sigma_u^2$ is, the greater the performance gap, i.e., resource allocation has a greater impact when $\sigma_u^2$ is small. Considering the influence of $\sigma_u^2$ on system performance, a primary user that causes less interference to the secondary network should be preferred when selecting spectrum sharing partners.

Results Discussion
In this study, we investigate the performance of the proposed JPCRA scheme from two aspects. Firstly, the impact of parameter settings on the max-min achievable rate $R_{\min}$ is investigated. We verify that the growth of $R_{\min}$ is capped by the interference threshold of the PU and the maximum transmit power constraint, which reflects the requirement that the cognitive two-way relay network must first guarantee the performance of the primary network. Then, the performance of the JPCRA scheme is compared with the MASR scheme, showing that the JPCRA scheme is preferable for enhancing the link with poor channel quality. Finally, the superiority of the JPCRA scheme is verified by comparison with three traditional optimization schemes (FT-JoPSPA, FPS-JoTPA, FPA-JoTPS), which shows that the proposed JPCRA scheme achieves high performance over the whole range of the varied parameters ($Q$, $P_{1,\max}$ or $\sigma_u^2$).

Conclusions
We have studied the power control and resource allocation problem in the energy harvesting cognitive two-way relay network using the PS protocol and the DF relay forwarding protocol. By considering transmission fairness, a joint resource allocation problem under power control is established, which aims at maximizing the minimum transmission rate of the cognitive users. To solve the optimization problem, a stepped alternating optimization algorithm, named the JPCRA scheme, is proposed to obtain the optimized parameter values. Simulation results have shown that the proposed scheme can improve the transmission performance of the cognitive users with poor channel quality and have verified its superiority compared with three other traditional schemes.
In future work, we intend to extend the proposed scheme to an indoor relay environment or an unmanned aerial vehicle cognitive relay scenario. Another interesting research direction could be exploring deep learning for cognitive relay selection.

Figure 1. System model for two-way cognitive relay network.

Figure 2. The max-min achievable rate vs. the interference threshold of PU.

Figure 3. The max-min achievable rate vs. the maximum transmit power constraint of $S_1$.

Table 1. The parameters used in the simulation part.