Non-Cooperative Spectrum Access Strategy Based on Impatient Behavior of Secondary Users in Cognitive Radio Networks

Abstract: In a cognitive radio network (CRN), secondary users (SUs) compete for limited spectrum resources, so the spectrum access process of SUs can be regarded as a non-cooperative game. With sufficient artificial intelligence (AI), SUs can adopt spectrum access strategies through their learning ability so as to improve their own benefit. Taking into account the impatience of SUs with the waiting time to access the spectrum, and the fact that primary users (PUs) have preemptive priority to use the licensed spectrum in the CRN, this paper proposes a repairable queueing model with balking and reneging to investigate spectrum access. Based on a utility function from an economic perspective, the relationship between the Nash equilibrium and the socially optimal spectrum access strategy of SUs is studied through analysis of the system model. A spectrum pricing scheme is then proposed to maximize the social benefit. Simulation results show that the proposed access mechanism can make the Nash equilibrium strategy consistent with the socially optimal strategy, thereby maximizing the benefit of the whole cognitive system.


Introduction
With the rise of ubiquitous sensing technology, wireless sensor networks (WSNs) are widely used in environmental monitoring, smart homes, medical systems, space exploration and many other areas [1][2][3][4][5][6][7]. WSNs are an important underlying network technology of the Internet of Things (IoT) [8,9], but node deployment cost [10,11], limited energy [12] and the shortage of wireless spectrum resources [13] have restricted their development. In order to solve the spectrum crisis, cognitive radio (CR) technology is becoming a hot research topic [14,15]. The core idea of CR is to make radio devices intelligent and able to perform cognitive behaviors such as sensing, reasoning, learning, decision making and executing [16][17][18], which is consistent with the idea of artificial intelligence (AI) [19][20][21][22][23] to some extent. Especially in the distributed CRN, cognitive users have no knowledge of the system parameters and can only make spectrum decisions based on historical data by using their own learning ability.
There are two types of users in cognitive radio networks (CRNs), which can be called primary user (PU) and secondary user (SU), respectively. The PU obtains the fixed spectrum resources through purchase. Thus, the PU is the legal owner of the fixed spectrum and also can be called the licensed user. The SU, which is also known as the cognitive user, can only detect the idle spectrum through spectrum sensing, and then opportunistically accesses the idle spectrum to avoid interfering with the PU.
Since SUs need to share the spectrum with the PU, when an SU opportunistically utilizes the licensed spectrum, the SU may need to wait for the PU or other SUs to finish the transmission, thereby increasing the sojourn time of the SU in the system. Moreover, in order not to affect the performance of the PU, the SU must suspend its transmission in time to wait for PU to leave or switch to another idle channel to continue its transmission [24]. Queueing theory can effectively characterize the heterogeneity of different access channels and the influences of PU behavior on the communication performance of SU. Therefore, it is especially suitable for analyzing the sojourn time of SU. At present, many scholars have used queueing theory to study the dynamic spectrum access performance of SUs [25][26][27][28][29].
In [30], in view of the traffic loads of SUs, the authors designed sensing-based and probability-based spectrum decision schemes to minimize the overall sojourn time of SUs based on the preemptive resume priority (PRP) M/G/1 queueing theory. In [31], considering a dynamic spectrum access system in which an SU could choose either to rent a licensed dedicated band or to use spectrum holes, the authors analyzed the equilibrium behavior of the SU decision strategies and applied the results to maximize the revenue from renting dedicated bands to SUs based on the server-breakdown queueing model. In [32], the authors used the preemptive priority queueing model to investigate the equilibrium strategic behaviors of SUs with no, partial and full queue length information. However, all the papers mentioned above assume that once an SU chooses to access the spectrum, it will not leave the system until its transmission is completed. That is to say, the impatient behavior of SUs is not considered. Impatient behavior in queueing theory means that when a customer arrives at the system and there is no idle server, it joins the queue and waits for service, but leaves because of impatience once the waiting time exceeds its endurance time. In real wireless communication systems, if the services of SUs have strict requirements on transmission delay, such as delay-sensitive multimedia and voice services that must meet delay constraints, an SU that has already decided to access the spectrum will still renege from the waiting line once its waiting time in the queue exceeds its tolerable time. In [33] and [34], the impatient behavior of SUs was considered when the spectrum access strategies were designed in CRN. 
However, the authors only analyzed the traffic model in which the service times of SU and PU followed the negative exponential distribution in [33], and in [34], the service times of PU and SU were supposed to follow geometrical distribution. Therefore, both of the analytical models have limitations for their application. In this paper, we use the queueing theory to obtain the Nash equilibrium and socially optimal access strategies for SUs with impatient behavior in CRNs. Similar research focusing on SU access strategies from the economic aspects can be found in [35] and [36]. In [35], the authors studied the queueing control in cognitive radio systems and achieved the individually and socially optimal strategies based on the observable queue system in which SUs could know the current queue length when they arrived. In [36], for simplicity, the service times of PUs and SUs were assumed to be exponentially distributed and the scenario was modeled as an M/M/1 queueing game with server breakdowns where each SU wanted to optimize its benefit in a selfish distributed manner. At the same time, neither article takes into account the impatience of SUs. Therefore, considering the fact that if its waiting time exceeds the tolerance limit, the SU may give up the transmission and renege from the queue before entering service, we propose a repairable M/G/1+M queueing model with balking and reneging based on the queueing theory with impatient customers to extend the analysis model to more general cases. Then, we investigate the expected waiting time and actual transmission time of SUs by employing the system model and analyze the Nash equilibrium access strategy and socially optimal access strategy. On the basis of the analysis results, the corresponding spectrum pricing scheme is developed to maximize the social welfare.
The rest of this paper is organized as follows. In Section 2, we present the system model to characterize the spectrum access strategy and evaluate the waiting time and actual service time of SUs. Next, we propose a reasonable spectrum pricing scheme to maximize the social welfare by comparing the Nash equilibrium access strategy with socially optimal access strategy in Section 3. Then, numerical results are shown in Section 4. Finally, we give our concluding remarks in Section 5.

System Model and Problem Statement
At present, the architecture of CRN can be either centralized or distributed architecture [37]. In a centralized architecture, SU decisions are based on the information of the channel state and queue length obtained from the centralized controller, which may lead to large communication overhead. Moreover, the deployment of the centralized controller may not be feasible in some environments, such as in ad hoc or sensor networking environment. Therefore, we only consider the distributed architecture in this paper. Without the coordination and control of the centralized controller in the distributed CRN, the SU cannot accurately know the number of other ones sharing the same channel. The unobservable queue model is used to depict the non-cooperative and distributed nature of SUs. The queue is only a virtual queue which can describe the congestion between SUs accessing the same channel. The SU sojourn time analysis is based on the statistical information of the behaviors of PU and SU. We suppose there is only one licensed channel in the system. SU must ensure PU has absolute priority in accessing the channel and suspend its transmission immediately to wait for the PU to leave once the PU is detected during the transmission. SUs can be treated as the customers in the queueing theory and receive services according to the first-come first-served (FCFS) principle. Moreover, the licensed channel can be considered as a server working at ON/OFF state. The ON state means the channel is not occupied by the PU currently, and can be regarded as idle or working normally for the server to provide services to customers, while the OFF state represents the channel being occupied by the PU and unable to support services to SUs. That is, the occurrence of the PU can be treated as a breakdown of the server and the transmission time of PU as the repair time. For simplicity, we assume that no other breakdown will occur during the repair period. 
The arrivals of PUs and SUs are assumed to follow independent Poisson processes, and the transmission durations follow general distributions. The transmission times are independent and identically distributed. Let $\lambda_p$ and $\lambda_s$ represent the average arrival rates of PUs and SUs, respectively. $X_p$ denotes the transmission duration of a PU, and $X_s$ denotes the effective transmission time of an SU without interruptions from PUs. Furthermore, let $F_p(x)$, $f_p(x)$, $F_s(x)$ and $f_s(x)$ denote the cumulative distribution function (CDF) of $X_p$, the probability density function (PDF) of $X_p$, the CDF of $X_s$ and the PDF of $X_s$, respectively.
Due to spectrum scarcity, the SUs must compete for the limited idle spectrum resources in the CRN, and each of them wants to maximize its own benefit through the best decision. Non-cooperative game theory can be used to model and analyze the interaction between SUs [38]. Each SU is assumed to be risk neutral, that is, it is only interested in maximizing the expected value of its own benefit [39]. Therefore, we assume that when each SU arrives at the system, it decides whether to join the waiting queue of the channel and share the spectrum resources with other SUs according to its estimated net welfare. To model the decision process, we assume all SUs that successfully complete their transmission get a fixed service benefit $R$. On the other hand, each SU incurs a holding cost of $C$ per unit of time in the system, which can be regarded as a penalty for delay or traffic congestion. If an SU spends $T_s$ time units in the system, its expected net benefit (denoted by $U_s$) is:
$$U_s = R - C\,T_s \tag{1}$$
According to (1), the SU accesses the channel and joins the queue to wait for transmission if the value of $U_s$ is nonnegative. Otherwise, the SU balks, because the sojourn time in the system is so long that the expected net benefit is negative. Since each SU is a rational player in the non-cooperative game, it decides whether or not to join the channel so as to maximize its own benefit. The decision strategy of an SU can be described by a fraction $q$ ($0 \le q \le 1$), which is the joining probability. Given the joining rule $q$, if the potential arrival rate of SUs is $\lambda_s$, then the effective arrival rate to access the channel is $\lambda_s q$. In addition, because of its delay requirement, an SU gives up the transmission and leaves the system if its required waiting time exceeds the tolerable waiting time $\tau$ before it gets the right to use the channel. 
We assume that the tolerable waiting time τ follows exponential distribution with mean 1/r. The M/G/1 + M queueing theory with impatient customers can be used to model and analyze the system. Moreover, we assume that SUs will not leave the system until finishing the whole transmission once they start transmitting.
From (1), we can see that the net welfare of an SU mainly depends on its sojourn time in the system, because the service benefit $R$ and unit cost $C$ are fixed. The sojourn time of an SU consists of the waiting time in the queue and the actual transmission time. We next analyze the waiting time and transmission time based on the system model. Let $W_{sn}$ be the required waiting time of SU $S_n$ ($n = 1, 2, \ldots$) before service, then define:
$$\alpha_n = \begin{cases} 1, & \text{if } S_n \text{ waits for service and completes its transmission} \\ 0, & \text{if } S_n \text{ reneges because of impatience} \end{cases} \tag{2}$$
For each required waiting time $y \ge 0$, let $P_{\alpha_n}(y)$ ($\bar{P}_{\alpha_n}(y)$) denote the conditional probability of the SU giving up the transmission and choosing to leave because of impatience (completing the transmission), so we have:
$$P_{\alpha_n}(y) = P(\alpha_n = 0 \mid W_{sn} = y), \qquad \bar{P}_{\alpha_n}(y) = P(\alpha_n = 1 \mid W_{sn} = y) \tag{3}$$
From (2), $P_{\alpha_n}(y) + \bar{P}_{\alpha_n}(y) = 1$, $y \ge 0$. We apply the level crossing methods to study the required virtual waiting time of SUs. Since the SUs whose waiting time exceeds $\tau$ renege from the queue before the start of service, they add zero to the required wait of any SU. Thus, for the sample path of the virtual wait for SUs, the system point (SP) jumps if the event $W_{sn} = y > 0$, $\alpha_n = 1$ occurs. Otherwise, the SP makes no jump if the event $W_{sn} = y > 0$, $\alpha_n = 0$ occurs, since $S_n$ receives zero service time. Let $W_s(t)$ ($t \ge 0$) denote the virtual waiting time of a would-be time-$t$ arrival, with probability density function $f_{sw}(x)$. The sample path of $W_s(t)$ ($t \ge 0$) is shown in Figure 1.
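As a small illustration of the staying and reneging probabilities defined above, the sketch below assumes the exponentially distributed patience time τ with mean 1/r used in the system model, so that an SU facing a required wait y stays with probability P(τ > y) = e^{-ry}. The function names `stay_probability` and `simulate_stay_fraction` are hypothetical and chosen only for this example; the simulation simply checks the closed form by sampling.

```python
import math
import random


def stay_probability(y: float, r: float) -> float:
    """P(tau > y) for an exponential patience time tau with rate r (mean 1/r)."""
    return math.exp(-r * y)


def simulate_stay_fraction(y: float, r: float, trials: int = 200_000, seed: int = 1) -> float:
    """Fraction of sampled patience times that exceed the required wait y."""
    rng = random.Random(seed)
    stays = sum(1 for _ in range(trials) if rng.expovariate(r) > y)
    return stays / trials


if __name__ == "__main__":
    r = 0.2  # impatience rate used in the numerical section
    for y in (0.0, 1.0, 5.0):
        print(y, stay_probability(y, r), simulate_stay_fraction(y, r))
```

The reneging probability for the same wait is one minus the value returned by `stay_probability`.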
Assuming renegers arrive at $a_n$ and stayers arrive at $b_n$, $n = 1, 2, \ldots$, we denote the actual transmission time of an SU by $G_s$, and let $G_s(x)$ be the corresponding distribution function. According to the level crossing methods, for any fixed level $x > 0$, the system point downcrossing and upcrossing rates of level $x$ are equal in the steady state. Therefore, we can get:
$$f_{sw}(x) = \lambda_p P_0 \bar{F}_p(x) + \lambda_s q P_0 \bar{G}_s(x) + \lambda_s q \int_0^x \bar{G}_s(x-y)\,\bar{P}_{\alpha_n}(y)\, f_{sw}(y)\,dy \tag{4}$$
where $P_0$ represents the stationary probability of the channel being idle, and $P_0$ can be expressed as follows:
$$P_0 = 1 - \int_0^{\infty} f_{sw}(x)\,dx \tag{5}$$
Moreover, $\bar{F}_p(x)$ and $\bar{G}_s(x)$ denote the complementary CDFs of $F_p(x)$ and $G_s(x)$, respectively. Thus, we have:
$$\bar{F}_p(x) = 1 - F_p(x), \qquad \bar{G}_s(x) = 1 - G_s(x) \tag{6}$$
In (4), the left side is the SP downcrossing rate of level $x$, while the right side, consisting of three terms, is the SP upcrossing rate of level $x$. The first term is the rate of SP jumps starting from level 0 due to PU arrivals that upcross level $x$. The jump size has the CDF $F_p(x)$. The second term is the SP upcrossing rate of $x$ due to effective SU arrivals when the channel is idle. Because the waiting time of those SUs equals zero, the corresponding SP jumps start from level 0 and the jump size has the CDF $G_s(x)$. The third term is the upcrossing rate of $x$ by SP jumps at arrival instants when the virtual wait is at state-space levels $y$ ($0 < y < x$). For each required waiting time $y \ge 0$, the probability of SUs staying for successful transmission is $\bar{P}_{\alpha_n}(y)$. When the actual transmission time of SUs is longer than $x - y$, the SP can upcross level $x$. The three terms on the right side of (4) account for the total rate at which SP jumps upcross level $x$. 
Moreover, because of the impatience, the required waiting time of SUs cannot be more than $\tau$. Hence, we can obtain:
$$f_{sw}(x) = \lambda_p P_0 \bar{F}_p(x) + \lambda_s q P_0 \bar{G}_s(x) + \lambda_s q \int_0^x \bar{G}_s(x-y)\, e^{-ry} f_{sw}(y)\,dy \tag{7}$$
Since (7) is a Volterra integral equation of the second kind, it can be solved by the classical successive substitution method. The initial step is:
$$\varphi_1(x) = \lambda_p P_0 \bar{F}_p(x) + \lambda_s q P_0 \bar{G}_s(x) \tag{8}$$
which is substituted in the integral of (7) for $f_{sw}(y)$ to complete the second iteration as:
$$\varphi_2(x) = \varphi_1(x) + \lambda_s q \int_0^x \bar{G}_s(x-y)\, e^{-ry} \varphi_1(y)\,dy \tag{9}$$
Repeating the substitution, we obtain:
$$\varphi_n(x) = \varphi_1(x) + \lambda_s q \int_0^x \bar{G}_s(x-y)\, e^{-ry} \varphi_{n-1}(y)\,dy, \quad n = 2, 3, \ldots \tag{10}$$
The Laplace-Stieltjes transform (LST) of the CDF for $X_p$ is given by:
$$F_p^*(s) = \int_0^{\infty} e^{-sx}\,dF_p(x) \tag{11}$$
Consequently,
Considering the effect of PU preemption on SUs, the LST of the CDF for the actual transmission time of SUs can be written as:
where $F_s^*(s)$ is the LST of the effective transmission time of SUs. Hence, we have:
Taking the LST of both sides of (8), (9) and (10) yields:
where $\varphi_n^*(s)$ is the LST of $\varphi_n(x)$. If $\varphi_n^*(s)$ converges, its limit is the LST of $f_{sw}(x)$ in (7). We can prove the convergence of $\varphi_n^*(s)$. The proof is given in Appendix A. Therefore, the LST of the PDF for the virtual waiting time of SUs is:
Then, the idle probability $P_0$ can be obtained according to (5) and (16):
We take the LST of both sides of (7) and have:
Letting $s \to 0$ gives:
where $E(G_s)$ means the expected actual transmission time of SUs given by:
For SU customers, let $W_{sv}$, $W_{st}$ and $W_s$ respectively denote the virtual waiting time in queue before service, the time an SU will wait to enter the server before reneging, and the time spent in the queue regardless of whether or not the SU reaches the server. Thus, we have [40]: x f sw (y)dy (21) and
Hence, the proportion of SUs that get complete service is:
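To make the successive substitution procedure for (7) concrete, the following is a minimal numerical sketch rather than the authors' implementation. It iterates the substitution on a uniform grid with the trapezoidal rule. Two points are assumptions of this sketch: the idle probability is recovered from the normalisation P_0 + ∫ f_sw(x) dx = 1, using the fact that (7) is linear in P_0, and the complementary CDFs are passed in as ordinary Python functions. The names `successive_substitution` and `waiting_density`, the grid size and the truncation point `x_max` are all hypothetical choices.

```python
import math


def successive_substitution(forcing, kernel, rate, r, h, iterations):
    """Iterate phi <- forcing + rate * int_0^x kernel(x - y) e^{-r y} phi(y) dy on a grid."""
    n = len(forcing)
    damp = [math.exp(-r * i * h) for i in range(n)]  # staying probability e^{-r y}
    phi = list(forcing)                               # initial step: the forcing term
    for _ in range(iterations):
        nxt = [forcing[0]]                            # the integral vanishes at x = 0
        for i in range(1, n):
            # trapezoidal rule for the convolution integral up to x_i
            s = 0.5 * (kernel[i] * damp[0] * phi[0] + kernel[0] * damp[i] * phi[i])
            s += sum(kernel[i - j] * damp[j] * phi[j] for j in range(1, i))
            nxt.append(forcing[i] + rate * h * s)
        phi = nxt
    return phi


def waiting_density(lam_p, lam_s, q, r, fp_bar, gs_bar,
                    x_max=15.0, n=300, iterations=30):
    """Return (grid, f_sw on the grid, P0) for equation (7), truncated at x_max."""
    h = x_max / n
    xs = [i * h for i in range(n + 1)]
    kernel = [gs_bar(x) for x in xs]
    # Forcing term of (7) evaluated with P0 = 1; the true density is P0 times the result.
    forcing = [lam_p * fp_bar(x) + lam_s * q * gs_bar(x) for x in xs]
    unit = successive_substitution(forcing, kernel, lam_s * q, r, h, iterations)
    area = h * (sum(unit) - 0.5 * (unit[0] + unit[-1]))
    p0 = 1.0 / (1.0 + area)  # assumed normalisation: P0 + integral of f_sw = 1
    return xs, [p0 * u for u in unit], p0


if __name__ == "__main__":
    # Sanity check: no PUs and no reneging reduces to an M/M/1 queue with
    # arrival rate 0.5 and service rate 2, whose idle probability is 0.75.
    _, _, p0 = waiting_density(0.0, 1.0, 0.5, 0.0,
                               lambda x: 0.0, lambda x: math.exp(-2.0 * x))
    print(p0)
```

In the sanity check the PU term is switched off and the service time is taken as a plain exponential, so the result can be compared with the textbook M/M/1 idle probability; for the full model, `gs_bar` would have to be the complementary CDF of the actual transmission time including PU interruptions.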

Spectrum Access Strategies of SUs
The spectrum access strategies of SUs can be described by a fraction $q$ ($0 \le q \le 1$), which is the probability of joining. Let $U_S(q)$ denote the average net benefit of SUs that choose to access the channel with probability $q$. Considering that each SU receives a reward of $R$ upon transmission completion, from (1) $U_S(q)$ can be written as:

Individual Equilibrium Strategy
Each SU wants to obtain a non-negative benefit. Let $q_e$ denote the individual equilibrium access strategy; under the Nash equilibrium state, no SU can improve its own benefit by unilaterally changing its strategy. For a newly arriving SU, if $U_S(0) \le 0$, then even if there is no other SU sharing the spectrum, the tagged SU suffers a non-positive benefit by joining. This implies that none of the SUs will choose to access the channel whatever the channel state (even if the channel is idle). In order to avoid trivialities, we assume $U_S(0) > 0$. If there exists a unique access probability $q_e$ ($0 < q_e < 1$) satisfying $U_S(q_e) = 0$, then the strategy of SUs joining with probability $q_e$ is the unique Nash equilibrium mixed strategy. In the case $U_S(1) \ge 0$, even if all SUs choose to access the channel, they still enjoy a non-negative benefit, so the strategy of joining with probability $q_e = 1$ is the unique equilibrium strategy.
In summary, it is a Nash equilibrium strategy for SUs to access the channel with probability $q_e$ if $q_e$ satisfies the following conditions:
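The case analysis above can be sketched numerically as follows. This is an illustration under stated assumptions, not the paper's expression: it assumes the utility is continuous and strictly decreasing in q, and it uses a hypothetical placeholder `stand_in_utility` (the classical unobservable M/M/1 form R - C/(μ - λq), without PUs or reneging) in place of the actual U_S(q). The function name `equilibrium_probability` is likewise chosen for this example only.

```python
from typing import Callable


def equilibrium_probability(utility: Callable[[float], float], tol: float = 1e-10) -> float:
    """Equilibrium joining probability for a utility that decreases in q."""
    if utility(0.0) <= 0.0:
        return 0.0  # joining does not pay even when nobody else joins
    if utility(1.0) >= 0.0:
        return 1.0  # joining pays even when everybody joins
    lo, hi = 0.0, 1.0  # utility(lo) > 0 > utility(hi) brackets the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if utility(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def stand_in_utility(q: float, R: float = 1.2, C: float = 1.0,
                     lam: float = 1.5, mu: float = 2.0) -> float:
    """Hypothetical placeholder utility, NOT the paper's U_S(q)."""
    return R - C / (mu - lam * q)


if __name__ == "__main__":
    print(equilibrium_probability(stand_in_utility))
```

For the placeholder, the root can be checked by hand: R = C/(μ - λq) gives q = (μ - C/R)/λ, which is about 0.778 for the default values.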

Socially Optimal Strategy
If the wireless spectrum resource is regarded as a public good, then the best way to share it is to maximize the expected net benefit of the whole cognitive system per time unit, which is also known as the social benefit (denoted by $S_o$). Therefore, the aim of the socially optimal strategy is to find a joining probability $q_o$ that maximizes the net benefit $S_o$, which is given by:
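The search for the socially optimal joining probability can be sketched as a one-dimensional maximisation. The example assumes the social benefit per time unit takes the form λ_s q U_S(q) (effective arrival rate times average net benefit) and that it is unimodal on [0, 1]; both are assumptions of this sketch. The utility is again a hypothetical M/M/1 placeholder, not the paper's U_S(q), and golden-section search is simply one convenient derivative-free method.

```python
import math


def maximise_unimodal(f, lo=0.0, hi=1.0, tol=1e-9):
    """Golden-section search for the maximiser of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:            # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                  # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def stand_in_utility(q, R=1.2, C=1.0, lam=1.5, mu=2.0):
    """Hypothetical placeholder utility, NOT the paper's U_S(q)."""
    return R - C / (mu - lam * q)


def social_benefit(q, lam=1.5):
    """Assumed form of the social benefit: effective arrival rate times utility."""
    return lam * q * stand_in_utility(q)


if __name__ == "__main__":
    q_o = maximise_unimodal(social_benefit)
    print(q_o, social_benefit(q_o))
```

For this placeholder the optimum is available in closed form, q_o = (μ - sqrt(Cμ/R))/λ, and it lies below the equilibrium probability of the same placeholder, which is the gap that a pricing scheme is meant to close.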

Spectrum Pricing
Since SUs act only in accordance with their individually optimal decisions and do not consider the negative externalities that they impose on later arrivals, a discrepancy between the individually and socially optimal behaviors may result. A cost charged to SUs may effectively motivate them to adopt the strategy that improves the whole social benefit. The pricing mechanism does not deny individual rationality, but achieves collective rationality under the premise of satisfying individual rationality and maximizes the benefit of the whole society. When a spectrum access fee $m$ is charged, the net benefit function of SUs becomes:
Then, if each SU follows its equilibrium strategy, a new equilibrium joining probability can be obtained. According to (24) and (27), in the state of equilibrium we can get the spectrum access fee $m$ by solving the equation as follows:
More concretely, after finding the socially optimal joining probability $q_o$ from (26) and substituting $q_o$ for $q_e$ in (28), we can get $m$ by solving (28). The spectrum access fee achieves consistency between the individual and social targets. Since the exact expression of the optimal spectrum price cannot be obtained, its existence and validity can only be illustrated by numerical examples.
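The pricing step can be illustrated with a short sketch. It assumes that the fee m is simply subtracted from the expected net benefit of a joining SU, so that the fee aligning the equilibrium with a target probability q_o is the utility evaluated at q_o. The utility and the social benefit are hypothetical placeholders (an unobservable M/M/1 form and arrival rate times utility), not the paper's expressions, and all function names are chosen for this example.

```python
def stand_in_utility(q, R=1.2, C=1.0, lam=1.5, mu=2.0):
    """Hypothetical placeholder utility, NOT the paper's U_S(q)."""
    return R - C / (mu - lam * q)


def social_benefit(q, lam=1.5):
    """Assumed social benefit: effective arrival rate times utility."""
    return lam * q * stand_in_utility(q)


def equilibrium(utility, tol=1e-10):
    """Equilibrium joining probability for a utility that decreases in q."""
    if utility(0.0) <= 0.0:
        return 0.0
    if utility(1.0) >= 0.0:
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if utility(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)


def access_fee(utility, q_target):
    """Fee that leaves a joining SU with zero net benefit at q_target."""
    return utility(q_target)


def demo():
    grid = [i / 10000 for i in range(10001)]
    q_o = max(grid, key=social_benefit)            # coarse socially optimal probability
    m = access_fee(stand_in_utility, q_o)          # fee derived from the target
    q_with_fee = equilibrium(lambda q: stand_in_utility(q) - m)
    return q_o, m, q_with_fee


if __name__ == "__main__":
    print(demo())
```

With the fee applied, the equilibrium of the placeholder model coincides with the target probability, which mirrors the consistency between individual and social targets described above. When the target is already q_o = 1 without any charge, no fee is needed.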

Numerical Results
In this section, we present some numerical results to illustrate the equilibrium and socially optimal access strategies. In order to make the analysis results more general, we consider two kinds of user models, called the Exp model and the ErlExp model. Furthermore, we define $f_\tau(x) = r e^{-rx}$ as the PDF of the patience time of SUs, which follows the exponential distribution.
(a) Exp Model: The effective transmission times of PUs and SUs are assumed to be exponentially distributed with PDFs $f_p(x) = \mu_p e^{-\mu_p x}$ and $f_s(x) = \mu_s e^{-\mu_s x}$, respectively. From (11) and (13), we have:
(b) ErlExp Model: The transmission time of the PU is assumed to follow a second-order Erlang distribution with PDF $f_p(x) = \mu_p^2 x e^{-\mu_p x}$, and the effective transmission time of the SU is exponentially distributed with PDF $f_s(x) = \mu_s e^{-\mu_s x}$. Therefore, we get:
Figure 2 shows the effect of the spectrum access fee $m$ on the social benefit $S$ and the SU equilibrium joining probability $q_e$. The system parameters are set as follows: (a) Exp model: $R = 12$, $C = 1$, $r = 0.2$, $\lambda_p = 0.3$, $\mu_p = 0.5$, $\lambda_s = 1.5$, $\mu_s = 2$; (b) ErlExp model: $\mu_p = 0.9$, and the other parameters are the same as those in the Exp model. The number of iterations is 3. As can be seen in Figure 2, as the spectrum price increases, the equilibrium joining probability of SUs decreases rapidly, while the social benefit first increases and then decreases gradually after reaching its maximum value. For example, when the access fee $m = 1.7$, the social benefit reaches a maximum in the Exp model. If the fee continues to increase, the social benefit begins to decline. When the access cost plus the delay loss of SUs exceeds the fixed return $R$, which means a negative net benefit, no SU is willing to access the channel, resulting in zero access probability and zero social benefit. The above results show that if the access fee is too high, the SUs refuse to use the spectrum because of the high cost, resulting in insufficient utilization of spectrum resources; and if the access fee is too low, SU access demand increases, resulting in excessive system congestion and a decline of the social benefit. Therefore, only when a reasonable price is found can the social benefit be maximized. This also illustrates the effectiveness of the spectrum pricing mechanism. The pricing mechanism can effectively control the number of SUs accessing the channel and make the individual equilibrium strategy consistent with the socially optimal strategy, thereby achieving a rational allocation of spectrum resources.
Figure 3 shows the variation of the social benefit $S$ with the parameter $r$ under the no-charge access mechanism and the optimal spectrum pricing mechanism in the Exp model. Except for the spectrum access fee $m$, the system parameters are the same as those in Figure 2. As shown in Figure 3, the social benefit gradually increases as $r$ increases. This is because as $r$ increases, the number of SUs who choose to renege grows, which decreases the average waiting time of SUs and raises their access probability. At the same time, the increase of $r$ reduces the waiting loss of SUs, so it brings more benefit to the whole society. Moreover, compared with the no-charge mechanism, the socially optimal pricing mechanism improves the social benefit more effectively when $r$ is small. Only when $r$ increases to a certain value does the socially optimal access probability become 1; that is, no access control for SUs needs to be performed, and the no-charge access mechanism is consistent with the socially optimal pricing mechanism.
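For the Exp and ErlExp models, the transforms involved can be sketched as follows. The LSTs of the exponential and second-order Erlang distributions are standard. The composition used for the actual transmission time, F_s*(s + λ_p - λ_p F_p*(s)), is the usual preemptive-resume completion-time form and is an assumption of this sketch about how PU interruptions enter, not a formula taken from the text. The function names and the numerical differentiation step are likewise hypothetical.

```python
def lst_exponential(rate):
    """LST of an exponential distribution with the given rate."""
    return lambda s: rate / (s + rate)


def lst_erlang2(rate):
    """LST of a second-order Erlang distribution with the given rate."""
    return lambda s: (rate / (s + rate)) ** 2


def lst_actual_transmission(lst_s, lst_p, lam_p):
    """Assumed completion-time LST of an SU interrupted by Poisson PU arrivals."""
    return lambda s: lst_s(s + lam_p - lam_p * lst_p(s))


def mean_from_lst(lst, h=1e-5):
    """Mean of a distribution as minus the derivative of its LST at s = 0."""
    return -(lst(h) - lst(-h)) / (2.0 * h)


if __name__ == "__main__":
    lam_p, mu_s = 0.3, 2.0  # parameter values of the numerical section
    exp_model = lst_actual_transmission(lst_exponential(mu_s), lst_exponential(0.5), lam_p)
    erl_model = lst_actual_transmission(lst_exponential(mu_s), lst_erlang2(0.9), lam_p)
    print(mean_from_lst(exp_model), mean_from_lst(erl_model))
```

Under this assumed form, differentiating at s = 0 gives a mean actual transmission time of E(X_s)(1 + λ_p E(X_p)), which the numerical derivative reproduces for both models.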

Conclusions
In this paper, we consider the impatient behavior of SUs and study their spectrum access strategies based on a repairable M/G/1+M queueing model with balking and reneging. Applying the proposed system model, we compare the individual equilibrium and socially optimal access strategies for SUs and give the corresponding spectrum pricing mechanism to maximize the social benefit. Numerical results support the analysis and illustrate the effectiveness of the pricing mechanism.
