A Correlation-Based Sensing Scheme for Outlier Detection in Cognitive Radio Networks

Abstract: Cooperative spectrum sensing (CSS) is a vital part of cognitive radio networks, as it determines the existence of the primary user (PU) in the network. However, the presence of malicious users (MUs) severely degrades the performance of the system. In the proposed scheme, each secondary user (SU) reports to the fusion center (FC) a hard decision based on its sensed energy to indicate the existence of the PU. The main contribution of this work deals with MU attacks, specifically spectrum sensing data falsification (SSDF) attacks. In this paper, we propose a correlation-based approach that differentiates between normal SUs and outliers by comparing the sensing report of each SU with the average sensing information of the other SUs, in order to detect SSDF attacks in the system. The FC determines the abnormality of an SU by measuring its similarity with the remaining SUs and declares the SU an outlier using the box-whisker plot. The effectiveness of the proposed scheme was demonstrated through simulations.


Introduction
The rapid growth of applications and devices, such as computers, laptops, iPads, and the internet of things (IoT), has increased the demand for spectrum, which results in spectrum shortage. According to the Federal Communications Commission (FCC), most of the spectrum is underutilized; even in crowded regions, the spectrum utilization is between 15% and 85% [1]. To tackle the issue of spectrum underutilization, cognitive radio technology (CRT) has emerged as a strong candidate for exploiting the spectrum [2]. The main functionality of CRT is to determine the availability of the spectrum for the secondary users (SUs). To achieve efficient utilization of the spectrum, the SUs need to continuously monitor it to find spectrum holes, and to vacate the spectrum whenever the primary user (PU) appears in the network [3].
Spectrum sensing is performed to determine the presence or absence of the PU in the network. Different detection techniques, such as matched filter detection, cyclostationary feature detection, and energy detection, have been proposed in the literature [4,5]. Every detection technique has its own features. For example, feature detection is optimal when the PU information is available at the SU. When the SU has no prior information about the PU and only knows the local noise power, energy detection is the optimal detection technique. In that case, the received signal energy is used to decide about the existence of the PU in the network.
Various techniques are used to efficiently utilize the scarce spectrum by merging the underlay and overlay methods in hybrid cognitive radio networks [6]. However, spectrum sensing is highly vulnerable to fading and hidden terminal problems between the PU and the SUs [7,8]. Thus, the sensing decision of a single SU is neither sufficient nor reliable for a final decision about the presence of the PU in the network. To overcome this problem, researchers take advantage of cooperative spectrum sensing (CSS). In CSS, each SU gathers information about the PU channel and shares its local sensing information with the fusion center (FC), which accumulates the sensing information from the SUs and declares the global decision about the presence of the PU in the network [9,10]. The SUs send information to the FC in one of two ways. Under the hard-decision rule, the SU sends a single bit of information to the FC. Under the soft combination rule, the SU sends a sampled energy value to the FC [11,12].
However, the existence of malicious users (MUs), or outliers, in the network severely degrades the performance of CSS. Various attacks that degrade the performance of the networks have been studied in the literature. Two common attacks are primary user emulation attacks (PUEAs) and spectrum sensing data falsification (SSDF) attacks [13,14]. In SSDF attacks, the MUs falsify the sensing results, which influences the sensing results in two ways. First, it decreases the probability of detection, which ultimately decreases the spectrum utilization. Second, it increases the probability of misdetection and the probability of false alarm, which increases interference in the network. Thus, the overall performance of CSS is degraded by SSDF attacks. To mitigate the effects of these attacks, several schemes have been proposed [15,16,17]. In Reference [18], the impact of incorrect sensing information was formulated in terms of detection performance and sensing efficiency; additionally, an authentication code length was proposed to reduce the system overhead. The authors of [19] proposed an MU suppression scheme, which consists of an improved energy detector followed by a statistical algorithm implemented at the FC. The authors of [20] proposed a neighbor detection-based spectrum sensing algorithm in distributed CRNs, which detects attackers with the help of neighbors during spectrum sensing to improve the decision-making accuracy. In this algorithm, the extreme outliers are isolated in the cognitive radio ad hoc network via the modified Z-test, and then the q-out-of-m rule is implemented to mitigate the SSDF attack [21]. Similarly, the reputation and q-out-of-m rule mechanisms were integrated to mitigate the effect of the SSDF attack [22,23].
The authors of [24] utilized a k-medoids clustering algorithm to mine the collection of sensing reports at the FC to determine the attacker's presence; additionally, the proposed scheme can be applied to streaming data (sensing reports), and thereby detects and isolates the attackers existing in the network. Intelligent MUs were accurately detected by the authors of [25], who used a physical-layer network coding scheme based on a novel friend or foe (FOF) detection scheme. A cross-layered approach was presented to enable the SU to differentiate between the PU and the MU through the hidden Markov model at the medium access control (MAC) layer [26]. The authors of [27] took advantage of compressive sensing to detect the attack and defensive behavior, and proposed a density-based MU detection with a trusted user to distinguish the MU precisely. A robust defense strategy against MUs via a double-sided neighbor distance-based genetic algorithm was presented in order to filter out the MU sensing reports in CSS [28]. The authors of [29] proposed a novel attacker identification algorithm that is able to detect attackers and reject their reported results. Moreover, a novel attacker punishment algorithm was provided with the aim of punishing attackers by lowering their individual energy efficiency, motivating them to quit sending false results. A comparative analysis of different outlier techniques was presented by the authors of [30]. Similarly, a comparative analysis of various outlier methods for MUs was discussed by the authors of [31]. The authors of [32] assigned a reputation value to each SU, while ignoring the SUs having a reputation below a threshold value. An extended sequential CSS scheme was proposed based on the value of each SU [33]. A critical analysis of MU attacks, i.e., always yes, always no, and random attacks, was presented by the authors of [34].
An onion peeling approach based on the calculation of suspicious levels was proposed by the authors of [35], which used belief propagation as the detection method. Protection of the CSS method, mentioned by the authors of [32], was reduced in the case of a large number of MUs.
In this paper, we propose a correlation-based scheme at the FC to detect the outlier and its behavior. In the proposed scheme, the FC first collects sensing information from all individual SUs, and then applies correlation to the differences between the result of each SU and the collective sensing results of the other SUs. The results of the normal SUs obtained in this way are dissimilar from those of the MUs. The box-whisker plot is then utilized to classify the outliers and the normal SUs, as it defines the upper and lower limits for the normal SUs. The proposed scheme is tested for the existence of opposite malicious users (OMUs) and random opposite malicious users (ROMUs). The OMUs always send a high-energy signal when the PU is absent and a low-energy signal when the PU is present. The ROMU is more dangerous and more difficult to cope with because its behavior is unpredictable: it behaves as a normal SU, while appearing as an MU with opposite behavior at random intervals of time. Unlike the always yes and always no attacks, both the OMU and the ROMU increase the probability of false alarm and of misdetection, which both degrades the bandwidth utilization and increases the interference to the PU network. Through the simulation study, we demonstrated that the proposed scheme can successfully distinguish the responses of both OMUs and ROMUs from those of the normal SUs.
The remaining sections of this paper are organized as follows. In Section 2, a detailed description of the system model is presented. In Section 3, we discuss the proposed scheme and describe the steps required for the classification of the outlier from the normal users. The performance evaluation and discussion are presented in Section 4. Finally, the paper is concluded in Section 5.

System Model
We considered a cooperative spectrum sensing scenario in a cognitive radio network. We assumed that the total number of outliers/MUs in the network was less than the total number of normal SUs. The system model for the proposed scheme is shown in Figure 1. The SUs performed spectrum sensing and sent their reports on the presence of the PU to the FC. Each SU forwarded a hard binary decision of 1 if the spectrum was occupied by the PU, and −1 if it was not. The FC received the local sensing reports from all SUs and then applied the proposed scheme to these reports to identify outliers on the basis of the history of each SU's reports. Once the outliers were identified and removed, the FC employed a simple rule to declare a global decision about the presence of the PU in the network. The signal received at an SU in a particular band was represented by the binary hypothesis

$$y_j(l) = \begin{cases} n_j(l), & H_0 \\ h_j\, s(l) + n_j(l), & H_1 \end{cases} \tag{1}$$

where $H_0$ is the absence hypothesis, $H_1$ represents the presence hypothesis of the PU in the network, $y_j(l)$ is the signal received by the jth SU, $n_j(l)$ is the additive white Gaussian noise (AWGN) in the lth time slot for the jth SU, $s(l)$ is the signal transmitted by the PU, and $h_j$ is the channel gain between the PU and the SU in the lth time slot. According to the hypotheses $H_0$ and $H_1$, the received signal energy of the channel at the jth SU in the ith sensing interval is

$$E_j(i) = \sum_{l=1}^{K} \left| y_j(l) \right|^2 \tag{2}$$

where $K$ denotes the number of samples in the ith sensing interval and $l$ runs over those samples. According to the central limit theorem (CLT), when $K$ is large enough, the energy reported by each SU converges to a Gaussian random variable under $H_0$ and $H_1$, which can be formulated as [28]:

$$E_j(i) \sim \begin{cases} \mathcal{N}(\mu_0, \sigma_0^2), & H_0 \\ \mathcal{N}(\mu_1, \sigma_1^2), & H_1 \end{cases} \tag{3}$$

where $(\mu_0, \sigma_0^2)$ are the mean and variance under $H_0$, $(\mu_1, \sigma_1^2)$ are the mean and variance under $H_1$, and the moments under $H_1$ depend on $\eta_j$, the signal-to-noise ratio (SNR) between the jth SU and the PU.
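The energy statistic described above can be illustrated with a small simulation. This is only a sketch under assumptions that the text does not state: real-valued samples, unit-variance noise, a channel gain of 1, and a constant-amplitude PU signal whose power equals the linear SNR. The function name `sensing_energy` and all parameter values are hypothetical.

```python
import random


def sensing_energy(pu_present, snr, K, rng):
    """Sum of squared received samples over one sensing interval.

    Assumptions (not from the paper): real-valued samples, unit-variance
    AWGN, channel gain h_j = 1, and a constant-amplitude PU signal whose
    power equals `snr` on a linear scale.
    """
    amplitude = snr ** 0.5 if pu_present else 0.0
    return sum((amplitude + rng.gauss(0.0, 1.0)) ** 2 for _ in range(K))


rng = random.Random(0)   # fixed seed so the run is repeatable
K = 100                  # samples per sensing interval (hypothetical value)
trials = 2000            # number of simulated sensing intervals

# Average energy when the PU is absent (H0) and present (H1).
mean_h0 = sum(sensing_energy(False, 0.5, K, rng) for _ in range(trials)) / trials
mean_h1 = sum(sensing_energy(True, 0.5, K, rng) for _ in range(trials)) / trials
print(f"mean energy under H0: {mean_h0:.1f}, under H1: {mean_h1:.1f}")
```

Under these particular assumptions each sample has expected squared value 1 when the PU is absent and 1 + SNR when it is present, so the two averages settle near K and K(1 + SNR), respectively. The gap between them is what a threshold on the energy exploits.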

Proposed Scheme
In this paper, we proposed a correlation-based approach to identify the legitimate SUs and outliers. The box-whisker plot was introduced to separate the legitimate SUs from the outliers. In the proposed scheme, each SU senses the spectrum using the energy-detection technique and compares the received signal energy with a threshold value. On the basis of the sensing results, the SUs send a hard decision of 1 or −1 to the FC, which can be given as

$$Z_j(i) = \begin{cases} 1, & E_j(i) > \gamma_j \\ -1, & \text{otherwise} \end{cases} \tag{4}$$

where $E_j(i)$ is the energy received by the jth SU in the ith sensing interval, $\gamma_j$ is the threshold set for the jth SU, and $Z_j(i)$ is the decision of the jth SU about the PU signal in the ith sensing interval: 1 if $E_j(i)$ is greater than the threshold value and −1 if $E_j(i)$ is smaller than the threshold value. The FC collects the sensing results of the individual SUs together with its own local decision as

$$Z = \begin{bmatrix} z_{11} & z_{12} & \cdots & z_{1M} \\ z_{21} & z_{22} & \cdots & z_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ z_{N1} & z_{N2} & \cdots & z_{NM} \end{bmatrix} \tag{5}$$

where $Z$ represents the hard-decision values of all the SUs accumulated in the database of the FC, with $z_{ij} = Z_j(i)$. In Equation (5), the rows represent the sensing intervals, and the columns represent the SUs' responses in each sensing interval. $M$ denotes the total number of SUs, including the normal SUs, the outliers/MUs, and the FC information, and $N$ is the number of sensing intervals. Furthermore, correlation was used as a tool for the detection of the most harmful and difficult-to-detect OMU and ROMU users. The correlation coefficient for two samples $X$ and $Y$ can be determined as

$$r_{XY} = \frac{\sum_{p} (X_p - \bar{X})(Y_p - \bar{Y})}{\sqrt{\sum_{p} (X_p - \bar{X})^2 \, \sum_{p} (Y_p - \bar{Y})^2}} \tag{6}$$

where $\bar{X}$ and $\bar{Y}$ are the mean values of the samples $X$ and $Y$, respectively, and $X_p$ and $Y_p$ are the pth elements of samples $X$ and $Y$, respectively. Equation (6) shows that the correlation of $X$ with $Y$ is the same as the correlation of $Y$ with $X$.
Correlation is a statistical measure of how strongly a pair of samples are related to each other. In Equation (6), the value of $r$ ranges from −1, when both variables move in opposite directions with a perfect negative correlation, to +1, when both variables move in the same direction with a perfect positive correlation. Used appropriately, correlation is a good measure of the relationship between two variables when there is a chance of outliers, non-normality, non-constant variance, and nonlinearity between the two variables being examined.
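As an illustration of the correlation coefficient and its symmetry, the following minimal sketch computes it for two short sequences of hard decisions. The helper name `pearson`, the example data, and the choice to return 0.0 for a constant sample are assumptions made for this example only.

```python
import math


def pearson(x, y):
    """Correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("samples must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant sample is treated as uncorrelated
        # instead of raising a division-by-zero error.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


honest = [1, -1, 1, 1, -1, -1]        # hypothetical hard decisions
opposite = [-z for z in honest]       # every decision inverted

r_same = pearson(honest, honest)      # perfect positive correlation
r_opposite = pearson(honest, opposite)  # perfect negative correlation
print(r_same, r_opposite)
```

A user that always inverts the honest decisions sits at the negative end of the range, which is the property the scheme relies on when it looks for opposite behaviour.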

Outlier Detection
All SUs send their sensing reports to the FC, as shown in Equation (5). At the FC, a relationship is verified by comparing each SU's sensing decisions with those of the other SUs, to determine any abnormal SU that sends falsified spectrum sensing data to the FC. The FC is able to identify both the OMU and ROMU categories of outliers/MUs through the following three steps.

Step One: Averaging Differences of the SUs
In this step, the FC determines the difference between the sensing results of the jth SU and those of the rest of the SUs. First, the average of all the SUs' sensing decisions is calculated while neglecting the jth SU's sensing result in the ith sensing interval, to find the impact of excluding this particular SU on the overall sensing result. The same process is performed for all $M$ SUs in each of the $N$ sensing intervals to find the average for each SU at the FC, determined as

$$m_{ij} = \frac{1}{M-1} \sum_{k=1,\, k \neq j}^{M} z_{ik}, \qquad i = 1, \ldots, N, \quad j = 1, \ldots, M \tag{7}$$

where $M$ is the total number of SUs, $N$ is the number of sensing intervals, and $m_{ij}$ is the average value of the reports from all the other SUs in the ith sensing interval while ignoring the jth SU's result. As the responses of both the OMU and the ROMU are different from the rest, taking such outliers/MUs out of the average value calculation in each sensing interval generates a dissimilar result for the OMU and ROMU users compared with the normal SUs.
In order to estimate how differently the individual sensing result of each SU, $z_{ij}$, behaves from the average value of the other users' results, $m_{ij}$, we considered the following

$$\Delta d_{ij} = z_{ij} - m_{ij} \tag{8}$$

where $\Delta d_{ij}$ is the difference in the sensing results of the jth SU in the ith sensing interval, $z_{ij}$ is the individual sensing result of the jth SU in the ith sensing interval, and $m_{ij}$ is the average sensing result of the SUs other than the jth SU.
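Step one can be sketched in a few lines. The sketch assumes that the difference is taken as the SU's own decision minus the average of the other SUs; the sign convention, the function name `leave_one_out_differences`, and the small report matrix are assumptions for illustration.

```python
def leave_one_out_differences(Z):
    """Return d[i][j] = z_ij - m_ij for a report matrix Z.

    Z is a list of rows (sensing intervals); each row holds one hard
    decision (+1 or -1) per SU. m_ij is the mean of row i with column j
    left out. The sign convention (own decision minus the average of the
    others) is an assumption of this sketch.
    """
    diffs = []
    for row in Z:
        M = len(row)
        if M < 2:
            raise ValueError("at least two SUs are needed")
        total = sum(row)
        # (total - z) / (M - 1) is the average of the other M - 1 SUs.
        diffs.append([z - (total - z) / (M - 1) for z in row])
    return diffs


# Hypothetical reports: two sensing intervals, three SUs.
# The third SU always disagrees with the other two.
Z = [[1, 1, -1],
     [-1, -1, 1]]
D = leave_one_out_differences(Z)
print(D)  # [[1.0, 1.0, -2.0], [-1.0, -1.0, 2.0]]
```

In this toy matrix the disagreeing SU obtains differences of larger magnitude and opposite sign, because removing it from the average leaves only users that agree with each other.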

Step Two: SUs' Correlation
The FC measures the difference value of each SU with respect to the rest of the SUs as in Equation (8). Once the differences were determined, we applied the correlation tool defined in Equation (6) to every pair of SUs, which is measured as

$$r_{jk} = \frac{\sum_{i=1}^{N} (\Delta d_{ij} - \overline{\Delta d}_j)(\Delta d_{ik} - \overline{\Delta d}_k)}{\sqrt{\sum_{i=1}^{N} (\Delta d_{ij} - \overline{\Delta d}_j)^2 \, \sum_{i=1}^{N} (\Delta d_{ik} - \overline{\Delta d}_k)^2}} \tag{10}$$

where $\Delta d_{ij}$ and $\Delta d_{ik}$ are the elements of the jth and kth user samples in the ith sensing interval, and $\overline{\Delta d}_j$ and $\overline{\Delta d}_k$ are the mean values calculated for the jth and kth SU samples.
The data of the normal SUs are separated from those of the outliers or MUs by adding all the correlation values in Equation (10) for each SU as follows:

$$C_j = \sum_{k=1,\, k \neq j}^{M} r_{jk} \tag{11}$$

Based on the results of Equation (11), the OMU had the most negative values, followed by the ROMU, when compared with the normal SUs. From Equation (11), the behaviors of all three categories of SUs, i.e., normal SUs, OMUs, and ROMUs, were identified. These behaviors identify the outliers in the network. The outlier values were further identified and classified in step three of the proposed scheme.
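The second step can be sketched as follows: correlate the difference sequences of every pair of SUs and add up, for each SU, its correlations with the others. The sketch assumes that the self-correlation term is left out of the sum (including it would only shift every score by the same constant) and that the difference is the SU's own decision minus the average of the others. The function names and the report matrix are hypothetical.

```python
import math


def pearson(x, y):
    """Correlation coefficient; returns 0.0 for a constant sample (assumption)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlation_scores(Z):
    """Per-SU sum of correlations between its differences and the others'."""
    M = len(Z[0])
    # Difference of each decision from the average of the other SUs.
    D = [[z - (sum(row) - z) / (M - 1) for z in row] for row in Z]
    # One difference sequence (column of D) per SU.
    columns = [[row[j] for row in D] for j in range(M)]
    return [
        sum(pearson(columns[j], columns[k]) for k in range(M) if k != j)
        for j in range(M)
    ]


# Hypothetical reports: six sensing intervals, four SUs.
# SUs 0-2 are mostly honest (SU 2 errs once, SU 1 errs once);
# SU 3 always reports the opposite of the true state.
Z = [[1, 1, 1, -1],
     [-1, -1, -1, 1],
     [1, 1, -1, -1],
     [-1, 1, -1, 1],
     [1, 1, 1, -1],
     [-1, -1, -1, 1]]
scores = correlation_scores(Z)
suspect = scores.index(min(scores))
print([round(c, 3) for c in scores], "most negative:", suspect)
```

In this toy example the always-opposite user ends up with by far the most negative score, while the mostly honest users stay close to each other, which is the separation that the classification step then turns into a decision.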

Step Three: Outlier Classification Using the Box-Whisker Plot
In the proposed scheme, we utilized the box-whisker plot to find both the OMU and the ROMU as outliers in the results of Equation (11). A box-whisker plot divides the results of Equation (11) into four parts. First, the results are sorted and divided into an upper and a lower half by the median. The median of the lower half is named the lower quartile, while the median of the upper half is named the upper quartile. The lower and upper extremes are marked as the least and greatest values of the results. All the SUs' results from step two, denoted by $C$, were arranged in ascending order from the lowest to the highest. The median value of the results can be calculated as $\mathrm{Med} = \mathrm{median}(C)$. The first quartile is denoted as $Q_1$, which is the value at the 25th percentile of $C$. The third quartile is denoted as $Q_3$, which is the value at the 75th percentile of $C$, and thus the inter-quartile range is determined by $\mathrm{IQR} = Q_3 - Q_1$. The upper and the lower limits of the box-whisker plot were measured and marked for detection of the outlier values as follows:

$$\text{Lower Limit} = Q_1 - 1.5 \times \mathrm{IQR} \tag{12}$$

$$\text{Upper Limit} = Q_3 + 1.5 \times \mathrm{IQR} \tag{13}$$

Once the lower and the upper limits were defined by the box-whisker plot through Equations (12) and (13), an SU was declared as one of the outliers, i.e., OMU or ROMU, on the basis of the following criterion:

$$\text{SU}_j = \begin{cases} \text{outlier (OMU or ROMU)}, & C_j < \text{Lower Limit} \ \text{or} \ C_j > \text{Upper Limit} \\ \text{normal SU}, & \text{otherwise} \end{cases} \tag{14}$$

where $C_j$ is the result of Equation (11) for the jth SU. Since the outliers have different responses in Equation (11) than the normal SUs, they are classified as outliers with Equation (14). The overall flow chart of the proposed scheme is shown in Figure 2.
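The classification step can be sketched as follows. The quartiles are computed here as the medians of the lower and upper halves of the sorted scores, with the middle value left out of both halves when the number of scores is odd; that tie-handling rule, the function names, and the score values are assumptions for illustration and are not results from the paper.

```python
from statistics import median


def box_whisker_limits(scores):
    """Lower and upper limits: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    if len(scores) < 2:
        raise ValueError("at least two scores are needed")
    ordered = sorted(scores)
    half = len(ordered) // 2
    q1 = median(ordered[:half])    # median of the lower half
    q3 = median(ordered[-half:])   # median of the upper half
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def flag_outliers(scores):
    """True for every score that falls outside the box-whisker limits."""
    lower, upper = box_whisker_limits(scores)
    return [c < lower or c > upper for c in scores]


# Hypothetical per-SU correlation sums; the third entry mimics an attacker.
scores = [-0.30, -0.25, -2.00, -0.35, -0.10, -0.20, -0.15, -0.30]
lower, upper = box_whisker_limits(scores)
flags = flag_outliers(scores)
print(round(lower, 3), round(upper, 3), flags)
```

For these values the quartiles are −0.325 and −0.175, so the limits are −0.55 and 0.05, and only the third score falls outside them.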

Numerical Evaluation
In this section, we numerically evaluate the performance of the proposed scheme considering the parameters given in Table 1.

To verify the effectiveness of the proposed scheme, we considered three scenarios. In the first scenario, we considered the existence of only the OMU together with the normal SUs in the network. The performance of the network is very sensitive to OMUs. Figure 3 shows the simulation results when the outlier behaves as an OMU. It can be observed that the normal SUs lie within the lower and upper limits defined by the box-whisker plot, whereas the OMU lies outside the limits. These limits define the boundary of the normal SUs in the network. Table 2 presents the values of the SNRs, the quartiles, the IQR, and the lower and upper limits of the SUs in the network.

In the second scenario, we considered only the existence of the ROMU and the normal SUs in the network. The ROMU behaves as an opposite malicious user with probability p and appears as a normal SU with probability 1 − p. Figure 4 shows the simulation results in this scenario. The upper and lower limits are given in Table 3. It can be seen from Figure 4 that the box-whisker plot defines the lower and upper limits for the normal SUs and that the proposed scheme places the normal SUs within these limits, whereas the ROMU response differs from that of the normal SUs and is identified in the system, which shows the effectiveness of the proposed scheme.

Figure 4. Correlation vs. SNR, when a random opposite malicious user (ROMU) exists in the network.

In the third scenario, we considered the existence of both the OMU and the ROMU in the network. Table 4 gives the upper and lower limits for the normal SUs. Figure 5 shows the simulation results when the OMU and the ROMU are equally distributed. From Figure 5, we can observe that the normal SUs lie within the range of the upper and lower limits, whereas the OMU and ROMU are not in the range of the limits set by the box-whisker plot. Figure 5 also shows that the detection results of the OMU were more negative than those of the ROMU. The ROMU behavior was closer to that of the normal SUs, so more care was required for the detection of such outliers.

In Figure 6, we show the receiver operating characteristic (ROC) comparison of the proposed scheme with other existing schemes when the MU was present in the network and when the MU did not exist in the network. Figure 6 demonstrates that when no scheme was applied, the probability of detection decreased and the probability of false alarm and the probability of misdetection increased. Furthermore, when the proposed scheme was applied in the presence of the MU, the performance was better than that of the other existing schemes.

Tabular and graphical results show that the proposed scheme was effective in detecting the outliers or MUs in CSS environments. The proposed scheme had the ability to identify and classify both types of outliers, i.e., the OMU and the ROMU. By utilizing the box-whisker plot, an SU was classified as an outlier by the FC if its result lay above the upper limit or below the lower limit. Through the simulation results, we have shown that the proposed scheme can detect outliers of both the OMU and the ROMU type.
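The attacker behaviours used in the three scenarios can be sketched as a small reporting model. This is only an illustration, not the authors' simulation setup: the function name `reported_decision`, the string labels, and the value p = 0.3 are hypothetical.

```python
import random


def reported_decision(honest_decision, kind, p=0.5, rng=random):
    """Hard decision (+1 or -1) that a user of the given kind sends to the FC.

    kind is "normal", "omu" (always the opposite decision) or "romu"
    (the opposite decision with probability p, the honest one otherwise).
    The labels and the default p are hypothetical.
    """
    if kind not in ("normal", "omu", "romu"):
        raise ValueError(f"unknown user kind: {kind!r}")
    if kind == "omu":
        return -honest_decision
    if kind == "romu" and rng.random() < p:
        return -honest_decision
    return honest_decision


rng = random.Random(1)   # fixed seed so the run is repeatable
trials = 10000
# Fraction of intervals in which a ROMU with p = 0.3 inverts its decision.
flips = sum(reported_decision(1, "romu", p=0.3, rng=rng) == -1
            for _ in range(trials)) / trials
print(f"ROMU inverted its report in {flips:.1%} of the intervals")
```

Because a ROMU inverts only a fraction p of its reports, its report history stays closer to that of a normal SU than an OMU's does, which matches the observation that the ROMU is the harder of the two to separate.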

Conclusions
SSDF attacks severely degrade the performance of CRNs. In this paper, we proposed a correlation-based approach using the box-whisker plot for the detection of outliers in the network. In the proposed scheme, we considered the hard decision of each SU, and the FC calculated the correlation to measure the similarity between the sensing results of the SUs and those of the outliers. The correlation-based approach separated the responses of the outliers from those of the normal SUs, and the box-whisker plot then defined the lower and upper limits for the SUs. Since the normal users lay within these limits, the outliers were distinguished from the normal SUs. Through intensive simulation studies, we verified that the proposed scheme has the ability to classify the OMU and ROMU outliers in CRNs.