Spectrum Sensing in Cognitive Radio Internet of Things: State-of-the-Art, Applications, Challenges, and Future Prospects

Raji, Akeem Abimbola; Olwal, Thomas O.

doi:10.3390/jsan14060109


by Akeem Abimbola Raji * and Thomas O. Olwal

Department of Electrical Engineering, F’SATI, Tshwane University of Technology, Pretoria 0001, South Africa

* Author to whom correspondence should be addressed.

J. Sens. Actuator Netw. 2025, 14(6), 109; https://doi.org/10.3390/jsan14060109

Submission received: 8 October 2025 / Revised: 6 November 2025 / Accepted: 11 November 2025 / Published: 13 November 2025


Abstract

The proliferation of Internet of Things (IoT) devices, driven by remarkable developments in mobile connectivity, has caused a tremendous increase in spectrum consumption in fifth generation (5G) mobile access. To secure the continued growth of IoT, communication resources in 5G wireless access must be managed efficiently. Cognitive radio (CR) has been advanced to maximize the utilization of the spectrum in radio communication networks. Integrating CR into IoT networks is a promising approach to productive spectrum utilization, with a view to making more spectral bands available to IoT devices for communication. An important function of CR is spectrum sensing (SS), which enables maximum utilization of the spectrum in radio networks. Existing SS techniques perform poorly in noisy channel states and are not immune to the dynamic effects of wireless channels. This article presents a comprehensive review of the approaches commonly used for SS. Furthermore, multi-agent deep reinforcement learning (MADRL) is proposed for enhancing the accuracy of spectrum detection in erratic wireless channels. Finally, we highlight the challenges that currently exist in SS in Cognitive Radio Internet of Things (CRIoT) networks and state future research directions.

Keywords: cognitive radio; spectrum sensing; machine learning

1. Introduction

Wireless communication capacity is expanding tremendously due to the surge in the number of IoT devices requiring high data volumes. The fifth generation (5G) spectrum band is occupied daily by new IoT devices, so spectrum management becomes essential for the continued communication of IoT devices. Cognitive radio (CR) is introduced to identify the idle spectrum of primary users (PUs) that can be assigned to secondary users (SUs) for communication, without interfering with the operation of the PUs [1]. PUs are the licensed and legitimate users of the spectral band. They have exclusive rights to the spectrum, while SUs (IoT devices) are opportunistic users that exploit the idle bands of the PUs [2,3]. Spectrum sensing (SS) is an important aspect of CR that improves spectrum utilization in a wireless network. It routinely monitors the PUs for spectral holes that can be exploited by the SUs for communication. SS in the Cognitive Radio Internet of Things (CRIoT) network faces challenges such as dynamism in network environments, noise, interference, and fluctuations in signal strength. These factors impede signal detection in radio networks. Thus, it is essential for SS algorithms to learn the intricate patterns of the network environment, be robust to noise, and exhibit high memory in order to access the idle spectrum of the PU [3,4,5,6].

Considerable efforts have been made by numerous researchers to develop novel sensing techniques for solving spectrum deficit problems in CRIoT networks. Among them are conventional techniques such as energy detection (ED), which, although simple to implement, struggles in noisy environments. The matched filter (MF) requires a priori knowledge of the environment [7]. Cyclostationary feature detection (CFD) is less affected by noise but requires high computational resources. Pilot-based techniques demand constant transmission of pilot signals to detect the availability of the unused spectrum [8,9]. Statistical models such as the Bayesian technique and hypothesis testing utilize prior knowledge of the PU to make an informed decision on the existence of a vacant spectrum [7]. The application of machine learning (ML) algorithms to SS problems has also attracted the interest of the authors in [10,11,12,13,14,15,16,17]. The computational convergence and accuracy of supervised ML methods depend on the quantity of data used for training the models, which is large in most cases. Other notable research works on SS are those of the authors in [18,19,20,21,22,23,24,25,26], which are summarized in Table 1. The authors in [18] introduce a technique based on the statistical covariance of the received signal, which performs better than ED; however, its performance degrades at low values of the threshold. A genetic algorithm is proposed by the authors in [19] to optimize the sensing time and to reduce the probability of missed detection. The technique requires prior knowledge of the PU and the channel condition, which may not be available before the communication system is designed. In [20], the authors improve the accuracy of spectrum detection using CFD and a maximum ratio combining (MRC) diversity scheme. The research reveals that the probability of detecting the PU in a radio network is high if multiple-input multiple-output (MIMO) antennas are used. However, the work fails to account for imperfections in the wireless channel and does not demonstrate the application of the proposed technique in a real-world scenario. In [21], the author introduces a Pietra–Ricci index detection scheme for detecting the presence and the absence of the PU in a radio network. The detector shows better performance than traditional techniques like ED, CFD, and MF. A goodness-of-fit test, which is critical for evaluating the performance of statistical models, is not considered by the author.

In [22], the authors propose a multi-user antenna system to detect the presence of the PU in noisy channel conditions, which ED and CFD would interpret as signifying the absence of the PU. The proposed technique falters in high-noise regimes of the wireless channel. In [23], particle swarm optimization (PSO) is utilized for optimizing the allocation of the spectrum to the SUs. The authors do not validate the proposed technique by comparing its performance with state-of-the-art techniques. In [24], the authors propose a linear combination scheme for SS in slow, block, and fast-fading environments. The scheme utilizes the mean and variance of the received signal to detect the unused spectrum band of the PU. Simulation results demonstrate the superiority of the proposed model over conventional linear combination schemes. Mobility, which is characteristic of real-life wireless channels, is overlooked by the authors for ease of analysis. In [25], the authors introduce a soft combination scheme for SS in a cooperative network, applying a likelihood ratio test to detect the PU in an environment where the signal-to-noise ratio (SNR) is below −5 dB; its performance is shown to be better than that of ED. The scheme incurs less sensing overhead, but its performance degrades at high values of SNR. In [26], a survey of the latest achievements in SS is presented. The authors discuss the use of CR in 5G and beyond broadband service and end the paper by highlighting the challenges for future research. However, the review does not discuss ML-based solutions to SS problems in the CRIoT network.

Similarly, the authors in [27] present various approaches to reducing sensing time in CR networks. The techniques considered for the task include ED, CFD, MF, and waveform detection schemes. ML-based solutions to SS problems are not discussed. This paper presents an overview of SS techniques employed in CRIoT networks. A comprehensive discussion of ML-based solutions to SS problems is presented. The article proposes multi-agent deep reinforcement learning (MADRL) for coping with the challenges posed by wireless channels and for enhancing spectrum detection results. Furthermore, challenges in implementation and prospects that may be explored as future research are highlighted and discussed.

The organization of the article is as follows: Section 1 introduces the paper; Section 2 focuses on the description of the CRIoT network; and Section 3 discusses SS in CRIoT networks. Section 4 describes ML-based techniques widely adopted for SS, which also features discussion on the proposed MADRL. An overview of the application of ML for SS in CRIoT networks is presented in Section 5, while the importance of SS in 5G and beyond networks is highlighted in Section 6. Lessons learnt and challenges of SS in CRIoT networks are highlighted in Section 7 and Section 8, respectively. Prospects that are expected to be addressed as future research are provided in Section 9 while we draw conclusions in Section 10.

2. Cognitive Radio Internet of Things

IoT has become a household name in recent years, owing to its numerous applications. IoT interconnects everything via the internet. It provides a platform for sensors, humans, cars, machines, and communication gadgets to communicate and connect. It fosters smart services such as smart health, smart agriculture, smart transportation, and smart building. The literature has documented various applications of IoT [28,29,30,31,32,33,34,35,36]. Figure 1 presents a smart world that is supported by ubiquitous 5G and beyond broadband internet access. The rapid growth of IoT has put enormous pressure on the radio spectrum in 5G wireless access. The quantity of spectral data that is processed daily has escalated [37] and there is increasing demand for more spectrum to cater for emerging IoT devices. CR is put forward as a solution to spectrum deficit problems in wireless networks. The combination of CR and IoT is intended to bridge the spectrum gap in radio networks by identifying spectral holes, that is, unoccupied spectrum of primary communication systems (PUs) such as mobile phones, Wi-Fi nodes, and WiMAX, which can be used by IoT devices for communication. Research on CR has become a hot topic in academia and industry, owing to its potential for supporting and sustaining the operation of IoT devices [38]. The front runner in this research is Mitola and his associate, who introduced CR in 1999 for effective management and better utilization of the radio spectrum [39]. The advantages of CR in the IoT network include the following:

(i): Handling the dynamic nature of the network environment: The wireless channel in which IoT devices operate changes dynamically, with the PU signal varying over time and space. Large-scale and flat fading, shadowing, and noise are common occurrences in the network environment. CR can intelligently switch IoT devices to a frequency band of the wireless channel with less noise, coping with the stochastic nature of the network environment. In other words, CR can adapt to the spatio-temporal characteristics of wireless channels.
(ii): Utilization of idle and underutilized spectrum: CR senses, adapts to, and accesses spectral bands that are not used by the PUs, culminating in better utilization of the spectrum. CR also ensures that the SU hops to another frequency band if its occupation of a certain band affects the operation of the PU. This prevents interference with and disruption of the operation of the PU [2,3].
(iii): Spectrum sharing: Another important aspect of CR is spectrum sharing, where the unused spectrum is shared among the SUs without posing danger to the primary communication [3].
(iv): Robustness to varying link capacity: With CR, SUs can change to a frequency band with higher link capacity when the channel’s condition changes or the operation of the PU is endangered.

3. SS in CRIoT Networks

SS in CRIoT networks takes different forms, as illustrated in Figure 2, which include non-cooperative, cooperative, and interference-based approaches. In non-cooperative SS (NCSS), each SU senses the PU independently and there is no cooperation between the SUs. This approach is affected by shadowing, fading, and noise interference. Cooperative SS (CSS) is divided into centralized and distributed CSS. In centralized CSS, several SUs sense the PU signal to obtain detection information [40].

The information received is combined at the central location, called the control center, as illustrated in Figure 3. The final decision is made by comparing detection information with the preset threshold. The PU is said to be active and the band is occupied if the detection information at the control center is greater than the threshold, otherwise, the band is vacant.

In distributed CSS, there is no control center; each SU shares its detection information and interprets the decisions of the other SUs. This approach incurs a large communication overhead, which burdens the communication network [41]. In interference-based SS, the PU’s signal is detected by checking its interference with the SU. It is assumed that if one of the SUs interferes with the PU signal, that SU is within the communication range of the PU and the PU is detected [41,42]. If there is no interference, the SU is not within the communication bandwidth of the PU, which indicates that the PU band is free. Generally, CSS improves the accuracy of signal detection, owing to the spatial diversity of the locations of the SUs. Unlike NCSS, CSS is robust to shadowing, fading, and changes in channel condition. CSS has received major attention in the literature on account of its robustness to the dynamic nature of wireless channels.

The techniques commonly used for SS include ED, CFD, MF, Pietra–Ricci index detection, and ML-based techniques. These methods are applicable in any of the SS approaches and are presented in what follows.

3.1. Energy Detection (ED)

ED has received much attention due to its ease of implementation. It utilizes moderate computational effort and does not require prior knowledge of the PU [8]. The average energy of the received signal may be computed via Equation (1) [8] as

$$E = \frac{1}{M} \sum_{m=1}^{M} |X(m)|^{2} \tag{1}$$

where $E$ is the energy of the received signal, $X(m)$ is the $m$-th sample of the received signal, $m = 1, 2, \dots, M$, and $M$ is the length of the received signal in samples. In implementing ED for sensing the PU, the null and alternative hypotheses are formulated as Equation (2):

$$\begin{cases} H_{0}: X(m) = V(m) \\ H_{1}: X(m) = H * Y(m) + V(m) \end{cases} \tag{2}$$

for which $V(m)$ is the additive white Gaussian noise (AWGN), $H$ is the channel matrix of the wireless medium, consisting of channel gains that are independent and identically distributed, and $Y(m)$ is the symbol transmitted by the PU. The null hypothesis $H_{0}$ indicates the absence of the PU, when only noise is received, while $H_{1}$ indicates that the PU is active and present. ED compares the detection information (the energy of the received signal) with a threshold, forming the decision rule of Equation (3):

$$\begin{cases} H_{0}: E < \text{threshold}, & \text{PU is inactive} \\ H_{1}: E > \text{threshold}, & \text{PU is active} \end{cases} \tag{3}$$

It is implied in Equation (3) that the PU is active if the detection information is greater than the threshold while the PU is inactive if the detection information is lower than the threshold.
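The decision rule can be illustrated with a minimal Python sketch, which should be read as an illustration under stated assumptions rather than the implementation used in [8]. The function names, the real-valued unit-variance noise, the binary PU symbols, and the threshold value of 1.5 are all assumptions introduced here for the example.

```python
import random

def average_energy(samples):
    """Average energy of the received samples, as in Equation (1)."""
    return sum(abs(x) ** 2 for x in samples) / len(samples)

def energy_detect(samples, threshold):
    """Decision rule of Equation (3): True means the PU is declared active."""
    return average_energy(samples) > threshold

rng = random.Random(0)   # fixed seed so that the run is repeatable
M = 2000                 # number of received samples (illustrative)

# H0: noise only. H1: hypothetical +/-1 PU symbols plus noise (about 0 dB SNR).
noise_only = [rng.gauss(0.0, 1.0) for _ in range(M)]
pu_plus_noise = [rng.choice((-1.0, 1.0)) + rng.gauss(0.0, 1.0) for _ in range(M)]

threshold = 1.5          # assumed value, between the expected energies of 1 and 2
pu_absent_decision = energy_detect(noise_only, threshold)
pu_present_decision = energy_detect(pu_plus_noise, threshold)
print(pu_absent_decision, pu_present_decision)
```

In practice the threshold is not picked by hand as it is here; it is derived from a target probability of false alarm, as discussed next in the text.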

The performance of SS techniques is evaluated over a number of metrics, including the probability of detection $P_{D}$, the probability of missed detection $P_{MD}$, the probability of false alarm $P_{FA}$, and accuracy. $P_{D}$ refers to the probability of asserting that the PU is active when it is actually transmitting. This parameter is expected to be high to minimize interference with the operation of the PU. $P_{MD}$ refers to the probability of declaring that the PU is absent when, in actual fact, it is transmitting. The value of $P_{MD}$ should be minimal in order to prevent interference between the PUs and the SUs. $P_{FA}$ is the probability of declaring that the PU is active when, in actual fact, its band is unoccupied. The value of $P_{FA}$ is expected to be insignificant to avoid the loss of spectral bands that may be utilized by SUs for communication. Accuracy refers to the proportion of correct decisions, that is, cases in which the PU is detected to be active and its band is actually occupied, as well as cases in which the PU is detected to be inactive and its band is free. For ED, $P_{D}$ and $P_{FA}$ are expressed by Equations (4) and (5) [8] as

$$P_{D} = \mathbb{Q}\left(\frac{\lambda_{ED} - M(1 + \alpha)}{\sqrt{2M(1 + \lambda_{ED})}}\right) \tag{4}$$

$$P_{FA} = \mathbb{Q}\left(\frac{\lambda_{ED} - M\sigma_{v}^{2}}{\sqrt{2M\sigma_{v}^{4}}}\right) \tag{5}$$

where $\mathbb{Q}(\cdot)$ is the Gaussian Q-function (the tail probability of the standard normal distribution), $\sigma_{v}^{2}$ is the noise variance, $\alpha$ is the signal-to-noise ratio (SNR), and $\lambda_{ED}$ is the sensing threshold, which is given by [8] as

$$\lambda_{ED} = \left(\mathbb{Q}^{-1}(P_{FA}) \sqrt{2M} + M\right) \sigma_{v}^{2} \tag{6}$$

In Equation (6), it is seen that $\lambda_{ED}$ is a function of the noise power, the probability of false alarm, and the length $M$ of the sensed signal. ED is affected by constant changes in wireless channels and suffers a dip in performance at low SNR. The demonstration of ED for SS has been presented by the authors in [8,9,43].
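As a hedged illustration of Equations (5) and (6), the sketch below computes the sensing threshold for a target false-alarm probability and then substitutes it back into Equation (5) to confirm that the target is recovered. Because the threshold scales with the number of samples, it appears to apply to the summed energy rather than to the average of Equation (1); that reading, the function names, and the numerical values are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()   # standard normal: mean 0, standard deviation 1

def q_function(x):
    """Gaussian Q-function: tail probability of the standard normal."""
    return 1.0 - _STD_NORMAL.cdf(x)

def q_inverse(p):
    """Inverse of the Q-function for 0 < p < 1."""
    return _STD_NORMAL.inv_cdf(1.0 - p)

def ed_threshold(p_fa, num_samples, noise_var):
    """Sensing threshold of Equation (6)."""
    return (q_inverse(p_fa) * sqrt(2 * num_samples) + num_samples) * noise_var

def ed_false_alarm(threshold, num_samples, noise_var):
    """Probability of false alarm of Equation (5)."""
    spread = sqrt(2 * num_samples * noise_var ** 2)   # sqrt(2 M sigma^4)
    return q_function((threshold - num_samples * noise_var) / spread)

# Illustrative values, not taken from the source.
num_samples, noise_var, target_p_fa = 1000, 1.0, 0.05
lam = ed_threshold(target_p_fa, num_samples, noise_var)
recovered_p_fa = ed_false_alarm(lam, num_samples, noise_var)
print(lam, recovered_p_fa)
```

A smaller target $P_{FA}$ raises the threshold, which protects spectral opportunities for the SUs at the cost of a lower probability of detection for a fixed SNR.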

3.2. Cyclostationary Feature Detection (CFD)

An alternative spectrum detector is CFD. It senses the PU by exploiting the periodic features of the modulated signal. In CFD, it is assumed that the noise is a stationary signal without correlation, while the modulated signal is cyclostationary with spatial correlation due to the redundancy of the signal periodicity [41]. CFD is robust to noise and it is less affected by fading and shadowing [8]. The cyclic autocorrelation function of the received signal admits an expression of the form given by Equation (7) [44] as

$$A_{xx} = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x\left(t + \frac{\tau}{2}\right) * x\left(t - \frac{\tau}{2}\right) dt \tag{7}$$

where $T$ is the period, $x(t)$ is the received signal, $\tau$ is the time delay, $*$ is the symbol for correlation, and $A_{xx}$ is the autocorrelation function. The spectral correlation density, obtained from the Fourier transform of Equation (7), is given by Equation (8) [44] as

$$S_{xx} = \int_{-\infty}^{\infty} A_{xx} e^{-j 2 \pi f \tau} d\tau \tag{8}$$

The received signal exhibits cyclostationarity if $A_{xx}$ is not equal to zero; otherwise, the received signal exhibits stationarity [44]. When $A_{xx}$ is equal to zero, it is taken that only noise is present in the PU band. CFD is implemented by forming the null and alternative hypotheses, as demonstrated by Equation (9):

$$\begin{cases} H_{0}: x(t) = n(t) \\ H_{1}: x(t) = H * y(t) + n(t) \end{cases} \tag{9}$$

in which $n(t)$ is the noise, $y(t)$ is the signal transmitted by the PU, and $H$ is the channel matrix. The presence or absence of the PU is determined via the decision rule expressed by Equation (10) as

$$\begin{cases} H_{0}: A_{xx} < \text{threshold}, & \text{PU is inactive} \\ H_{1}: A_{xx} > \text{threshold}, & \text{PU is active} \end{cases} \tag{10}$$

It is seen in Equation (10) that the PU is sensed to be active if the autocorrelation function is greater than the threshold, else, the PU is inactive. An illustration of SS via CFD is depicted in Figure 4.

CFD is computationally expensive and its implementation takes a longer time [8,43,44]. The application of CFD for SS has been presented by the authors in [8,9,20,23,43,44,45].
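The following Python sketch illustrates the idea behind the decision rule of Equation (10) on sampled data. It is a simplified estimator, not the procedure of [44]: the lag product is averaged against a complex exponential at a cycle frequency (the parameter `cycle_freq`, which is an assumption of this sketch and not a symbol from the text), the lag is applied to one side only, and the carrier, symbol length, and threshold are illustrative values.

```python
import cmath
import math
import random

def cyclic_autocorrelation(x, cycle_freq, lag=0):
    """Finite-sample average of x[n + lag] * conj(x[n]) weighted by a complex
    exponential at cycle_freq (in cycles per sample)."""
    n_terms = len(x) - lag
    total = 0j
    for n in range(n_terms):
        lag_product = x[n + lag] * complex(x[n]).conjugate()
        total += lag_product * cmath.exp(-2j * math.pi * cycle_freq * n)
    return total / n_terms

rng = random.Random(1)
N, carrier = 4096, 0.125   # sample count and carrier in cycles/sample (illustrative)

# Hypothetical PU waveform: +/-1 symbols, 8 samples each, on a cosine carrier.
symbols = []
for n in range(N):
    if n % 8 == 0:
        current = rng.choice((-1.0, 1.0))
    symbols.append(current)
noise = [rng.gauss(0.0, 1.0) for _ in range(N)]
pu_plus_noise = [symbols[n] * math.cos(2 * math.pi * carrier * n) + noise[n]
                 for n in range(N)]

# A carrier-modulated real signal shows a feature at twice the carrier frequency.
stat_h1 = abs(cyclic_autocorrelation(pu_plus_noise, 2 * carrier))
stat_h0 = abs(cyclic_autocorrelation(noise, 2 * carrier))
threshold = 0.1            # assumed value for this example
print(stat_h1 > threshold, stat_h0 > threshold)
```

The noise-only statistic stays small because stationary noise has no periodic component at the chosen cycle frequency, which is the property that makes CFD less sensitive to noise than ED.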

3.3. Matched Filter (MF)

MF requires information about the PU a priori for implementation. Information, such as packet format and modulation type and order, is critical for sensing the spectral hole of the PU [41]. The method converges faster than CFD and consumes less computational resources [46]. MF can be described by the expression of the form given by Equation (11) as

$$Y[n] = \sum_{k} h(n - k) x(k) \tag{11}$$

where $x(n)$ is the signal transmitted by the PU, which is convolved with the channel impulse response $h(n)$ delayed by $k$ time units, and $Y[n]$ is the test statistic that is used to decide on the availability of the PU. The null and alternative hypotheses for MF are expressed by Equation (12) as

$$\begin{cases} H_{0}: y(n) = w(n) \\ H_{1}: y(n) = h(n) * x(n) + w(n) \end{cases} \tag{12}$$

where $w(n)$ is the noise. The test statistic is compared with the threshold and the decision is formed such that Equation (13) holds as follows:

$$\begin{cases} H_{0}: Y[n] < \text{threshold}, & \text{PU is absent} \\ H_{1}: Y[n] > \text{threshold}, & \text{PU is present} \end{cases} \tag{13}$$

An illustration of this process is shown in Figure 5.

$P_{D}$ and $P_{FA}$ are expressed by Equations (14) and (15) [46] as

$$P_{D} = \mathbb{Q}\left(\frac{\lambda_{MF} - E}{\sqrt{E \sigma_{w}^{2}}}\right) \tag{14}$$

$$P_{FA} = \mathbb{Q}\left(\frac{\lambda_{MF}}{\sqrt{\sigma_{w}^{2} E}}\right) \tag{15}$$

for which $\lambda_{MF}$ is the sensing threshold, $E$ is the energy of the PU signal, $\sigma_{w}^{2}$ is the noise variance, and $\mathbb{Q}$ remains as defined earlier. The sensing threshold is given by Equation (16) [46] as

$$\lambda_{MF} = \mathbb{Q}^{-1}(P_{FA}) \sqrt{\sigma_{w}^{2} E} \tag{16}$$

MF requires prior knowledge of the PU, and it exhibits poor results if the information from the PU is not accurate [8,47]. The application of MF for SS has been presented by the authors in [8,46,47,48,49].
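A minimal Python sketch of the MF decision is given below as an illustration only. It assumes the common correlator form of the matched filter, in which the received samples are correlated with a PU waveform that the SU knows in advance, and it takes the threshold from Equation (16). The pilot sequence, the unit channel gain, the target $P_{FA}$, and the function names are hypothetical choices made for this example.

```python
import random
from math import sqrt
from statistics import NormalDist

def q_inverse(p):
    """Inverse Gaussian Q-function for 0 < p < 1."""
    return NormalDist().inv_cdf(1.0 - p)

def mf_threshold(p_fa, noise_var, signal_energy):
    """Sensing threshold of Equation (16)."""
    return q_inverse(p_fa) * sqrt(noise_var * signal_energy)

def mf_statistic(received, known_signal):
    """Correlate the received samples with the known PU waveform."""
    return sum(r * s for r, s in zip(received, known_signal))

rng = random.Random(2)
known_signal = [rng.choice((-1.0, 1.0)) for _ in range(256)]   # hypothetical pilot
signal_energy = sum(s * s for s in known_signal)               # E in Equation (16)
noise_var = 1.0

lam = mf_threshold(1e-4, noise_var, signal_energy)

# H1: the pilot is received through a unit-gain channel with additive noise.
received_h1 = [s + rng.gauss(0.0, sqrt(noise_var)) for s in known_signal]
pu_present = mf_statistic(received_h1, known_signal) > lam
print(lam, pu_present)
```

The sketch also shows why accurate prior information matters: if `known_signal` does not match what the PU actually transmits, the correlation no longer accumulates coherently and the statistic may fall below the threshold.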

3.4. Pietra–Ricci Index Detection

Guimaraes in 2020 adapted the Pietra–Ricci index, commonly used in economics and social sciences, for detecting the vacant spectrum of the PU in a cooperative radio network [21]. Just like ED, CFD, and MF, the detection performance of this technique is tied to a threshold. The detection information (test statistic) of the Pietra–Ricci index scheme is expressed by Equation (17) [21] as

$$T = \frac{2 \sum_{j=1}^{n^{2}} |y_{j}|}{\sum_{j=1}^{n^{2}} |y_{j} - \bar{y}|} \tag{17}$$

in which $T$ is the test statistic, $y_{j}$ is the $j$-th element of the column vector of the received signal, $n$ is the number of SUs involved in SS, and $\bar{y} = \frac{1}{n^{2}} \sum_{j=1}^{n^{2}} y_{j}$. The method detects the presence and the absence of the PU by comparing $T$ with the threshold, where the PU is assumed to be present and active if $T$ is greater than the threshold; otherwise, the PU is absent and inactive. A demonstration of Pietra–Ricci index detection for SS can be found in [21]. The performance of this technique depends on the value of the threshold. A poor choice of the threshold may give a false indication of the condition of the PU.
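The sketch below evaluates Equation (17) in Python for a noise-only case and a case with a common PU signal. Since the sums run over $n^{2}$ elements, the sketch assumes that $y_{j}$ ranges over the entries of the $n \times n$ sample covariance matrix of the signals received at the $n$ SUs; this reading, the real-valued signals, the unit channel gains, and the threshold value are assumptions of the example and should be checked against [21].

```python
import random

def sample_covariance(signals):
    """signals[i][k] is sample k at SU i; returns the n x n sample covariance
    matrix for real-valued, zero-mean signals."""
    n, length = len(signals), len(signals[0])
    return [[sum(signals[i][k] * signals[j][k] for k in range(length)) / length
             for j in range(n)] for i in range(n)]

def pietra_ricci_statistic(cov):
    """Equation (17) applied to the n^2 entries of the matrix cov."""
    y = [value for row in cov for value in row]
    mean = sum(y) / len(y)
    return 2 * sum(abs(v) for v in y) / sum(abs(v - mean) for v in y)

rng = random.Random(3)
n_su, length = 4, 2000     # number of SUs and samples per SU (illustrative)

noise = [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(n_su)]
pu = [rng.gauss(0.0, 1.0) for _ in range(length)]          # hypothetical PU signal
received = [[pu[k] + noise[i][k] for k in range(length)] for i in range(n_su)]

t_h0 = pietra_ricci_statistic(sample_covariance(noise))     # PU absent
t_h1 = pietra_ricci_statistic(sample_covariance(received))  # PU present
threshold = 3.0            # assumed value for this example
print(t_h0 > threshold, t_h1 > threshold)
```

When only noise is received, the covariance matrix is close to diagonal and the statistic stays small; a PU signal common to all SUs fills in the off-diagonal entries and pushes the statistic upward.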

The weaknesses of traditional techniques, which include poor performance at low SNR, the demand for partial or full knowledge of the PU, and high sensing time, have renewed the interest of researchers in ML algorithms as a means of reducing spectrum detection time and enhancing the accuracy of spectrum detection results without prior knowledge of the PU. ML techniques devoted to SS are discussed in what follows.

4. ML-Based SS Techniques

ML is adopted in the modern era to address problems in health, education, disaster management, agriculture, school management, and fraud detection [50,51,52,53,54,55,56,57,58]. Its use has also been extended to telecommunications to address spectrum deficits in CRIoT networks, where SS is treated as a classification problem. The focus of this section is to present ML techniques suitable for SS in CRIoT networks. It is worth noting that we do not provide an in-depth analysis of ML models because a comprehensive report on them can be found in [59,60,61,62]. ML is divided into various classes, as illustrated in Figure 6. These classes include unsupervised learning, supervised learning, reinforcement learning, and deep learning [63].

4.1. Supervised Learning

In this ML technique, an algorithm is utilized to learn from labeled data for predicting an outcome. The labeled data consist of input–output pairs, which enable the supervised learning algorithm to learn mapping functions that can be applied to raw, unseen, and unlabeled inputs. Common applications of supervised ML are image recognition and detection, spam filtering, classification, and regression analysis [64,65]. Examples of supervised ML algorithms are support vector machine (SVM), logistic regression (LR), decision tree, random forest (RF), Naïve Bayes (NB), and K-nearest neighbor (KNN).

4.1.1. Support Vector Machine (SVM)

In SVM, kernel functions map the input datasets into high-dimensional spaces. A flat subspace (hyperplane) is utilized to separate the data into different classes. This makes it easier to execute classification and regression tasks. Kernels that are used in SVM include linear, radial basis, cubic, and quadratic functions. A linear kernel function is expressed in the form given by Equation (18) [66] as

$$L(E) = w^{T} E + a \tag{18}$$

where $L(\cdot)$ represents the linear kernel function, $w$ is the weight vector, $a$ is the bias, $E$ is the energy of the received signal that is being sensed, and $(\cdot)^{T}$ denotes the transpose of a vector. The SVM is obtained by solving the optimization problem of Equation (19) [66]:

$$\min_{w \in R^{d}} \|w\|^{2} + C \sum_{i=1}^{N} \max\left(0, 1 - x_{i} L(E_{i})\right) \tag{19}$$

where the optimization is solved over the weight $w$, $x_{i}$ is the output for the $i$-th input element, $C$ is the regularization constant, and $E_{i}$ is the detection information for the $i$-th element of the input dataset. In an SVM, the PU signal is detected if $E_{i}$ exceeds the threshold; otherwise, the PU is inactive [67]. The authors in [67] demonstrate that an SVM exhibits better accuracy than the Gaussian Mixture Model (GMM) in sensing the spectral hole of the PU.
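To make Equation (19) concrete, the Python sketch below minimizes the objective by plain subgradient descent for a scalar energy feature, so that $w$ reduces to a single weight. This is an illustration under assumptions, not the training procedure of [66,67]: the labels are taken as $-1$ (PU inactive) and $+1$ (PU active), and the training energies, step size, and number of epochs are made-up values.

```python
def train_linear_svm(energies, labels, C=1.0, step=0.001, epochs=20000):
    """Subgradient descent on ||w||^2 + C * sum(max(0, 1 - x_i * (w * E_i + a)))."""
    w, a = 0.0, 0.0
    for _ in range(epochs):
        grad_w, grad_a = 2.0 * w, 0.0              # gradient of the ||w||^2 term
        for energy, label in zip(energies, labels):
            if label * (w * energy + a) < 1.0:     # hinge term is active
                grad_w -= C * label * energy
                grad_a -= C * label
        w -= step * grad_w
        a -= step * grad_a
    return w, a

def predict(w, a, energy):
    """Sign of the linear function of Equation (18): +1 means PU active."""
    return 1 if w * energy + a > 0 else -1

# Hypothetical training energies: low values under H0, high values under H1.
energies = [0.8, 0.9, 1.0, 1.1, 1.2, 2.8, 2.9, 3.0, 3.1, 3.2]
labels = [-1, -1, -1, -1, -1, 1, 1, 1, 1, 1]

w, a = train_linear_svm(energies, labels)
print(predict(w, a, 1.0), predict(w, a, 3.0))
```

For a single energy feature, the learned boundary $-a/w$ plays the role of the sensing threshold; the benefit of the SVM formulation appears when several features, for example energies reported by several SUs, are combined in the vector $E$.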

4.1.2. Logistic Regression

This technique predicts the probability that an input dataset belongs to a specific class. By using a sigmoid function, the algorithm maps a real-valued set of independent variables to a probability value between 0 and 1. The input features of a given dataset may be expressed in matrix form as Equation (20):

$$Y = \begin{pmatrix} y_{11} & \dots & y_{1n} \\ \vdots & \ddots & \vdots \\ y_{m1} & \dots & y_{mn} \end{pmatrix} \tag{20}$$

The dependent variable $X$, consisting of binary values, is given by Equation (21) as

$$X = \begin{cases} 0 & \text{if class 1} \\ 1 & \text{if class 2} \end{cases} \tag{21}$$

By applying a linear function to the input variable $Y$, we have an expression of the form given by Equation (22) as

$$G = \left(\sum_{k=1}^{N} w_{k} y_{k}\right) + a \tag{22}$$

where $y_{k}$ is the $k$-th observation data of $Y$, $w = [w_{1}, w_{2}, w_{3}, \dots, w_{N}]$ is the weight vector, and $a$ is the bias. The algorithm employs a sigmoid function to transform $G$ into a probability value between 0 and 1, which is used to predict the presence and the absence of the PU. That is,

$$\sigma(G) = \frac{1}{1 + e^{-G}} \tag{23}$$

where $\sigma(G)$ is the sigmoid function of $G$. It is deduced from Equation (23) that $\sigma(G)$ tends towards 1 as $G \to \infty$ and $\sigma(G)$ tends towards 0 as $G \to -\infty$. This means that $\sigma(G)$ is bounded between 0 and 1. The probability of belonging to a class may be expressed by Equation (24) as:

$$P(X = 1) = \sigma(G), \quad P(X = 0) = 1 - \sigma(G) \tag{24}$$
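Equations (22) to (24) can be traced with a few lines of Python. The sketch below is illustrative only: the weights and bias are hand-picked values rather than fitted parameters, and the function names are introduced here for the example.

```python
from math import exp

def linear_score(y, w, a):
    """Equation (22): weighted sum of the input features plus the bias."""
    return sum(w_k * y_k for w_k, y_k in zip(w, y)) + a

def sigmoid(g):
    """Equation (23)."""
    return 1.0 / (1.0 + exp(-g))

def class_probabilities(y, w, a):
    """Equation (24): returns the pair (P(X = 1), P(X = 0))."""
    p_one = sigmoid(linear_score(y, w, a))
    return p_one, 1.0 - p_one

# Hypothetical weights and bias, chosen by hand for illustration.
w, a = [2.0, 1.0], -3.0
p_one, p_zero = class_probabilities([1.0, 1.0], w, a)   # score is exactly 0
print(p_one, p_zero)
```

A score of zero sits on the decision boundary, so both classes receive a probability of 0.5; larger positive scores push $P(X = 1)$ towards 1.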

4.1.3. Decision Tree

In this learning scheme, input datasets are partitioned based on different features into a tree-like form. The tree has four components: the root node, the branches, the internal nodes, and the leaf nodes. The root node is the starting point of the tree, which represents the features of the dataset. A branch connects the nodes and represents the possible values that a node can take. An internal node represents a further decision based on the previous branch. A leaf node provides the final outcome of the classification. The authors in [66] build the decision tree by using Iterative Dichotomiser 3 (ID3), where information gain and entropy are employed as metrics to classify the training data for SS.
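The entropy and information gain used by ID3 can be sketched in Python as follows. The example is a simplified illustration and not the procedure of [66]: it scores candidate threshold splits on a single continuous energy feature, whereas classical ID3 works on categorical attributes, and the energies, labels, and candidate splits are made-up values.

```python
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / total
        result -= p * log2(p)
    return result

def information_gain(energies, labels, split):
    """Reduction in entropy obtained by splitting the data at energy <= split."""
    left = [l for e, l in zip(energies, labels) if e <= split]
    right = [l for e, l in zip(energies, labels) if e > split]
    weighted = sum(len(part) / len(labels) * entropy(part)
                   for part in (left, right) if part)
    return entropy(labels) - weighted

# Hypothetical training data: label 0 means PU inactive, 1 means PU active.
energies = [0.9, 1.0, 1.1, 1.9, 2.0, 2.1]
labels = [0, 0, 0, 1, 1, 1]

candidate_splits = (1.05, 1.5, 1.95)
best_split = max(candidate_splits,
                 key=lambda s: information_gain(energies, labels, s))
print(best_split, information_gain(energies, labels, best_split))
```

The split with the highest information gain becomes the test at the current node, and the same procedure is repeated on each resulting subset until the leaves are pure or no useful split remains.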

4.1.4. Random Forest (RF)

This is an ML algorithm that utilizes an ensemble of decision trees to predict an outcome. It handles complex datasets and provides high classification accuracy, and it is a popular choice for both regression and classification work. Each tree in the forest is trained on a random subset of the input data, which helps to minimize overfitting and enhance generalization. Each tree contributes a decision, and the algorithm combines the decisions of multiple trees to predict an outcome [68], which gives it an edge over a single decision tree. The prediction of the presence and the absence of the PU using RF may be expressed in the form given by Equation (25) as:

$$\text{Outcome} = \frac{1}{A} \sum_{n=1}^{A} f_{n}[E] \tag{25}$$

where $E$ is the detection information concerning the status of the PU, $f_{n}$ is the classification function of the $n$-th tree, and $A$ is the number of trees that are involved in the decision.
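Equation (25) amounts to averaging the votes of the individual trees. The Python sketch below illustrates only this combining step: each "tree" is a hypothetical one-node stump with a hand-picked threshold standing in for a tree trained on a random subset of the data, and the 0.5 majority rule is an assumption of the example.

```python
def make_stump(threshold):
    """A one-node tree: votes 1 (PU active) when the energy exceeds threshold."""
    return lambda energy: 1 if energy > threshold else 0

def forest_outcome(trees, energy):
    """Equation (25): average of the decisions of the individual trees."""
    return sum(tree(energy) for tree in trees) / len(trees)

# Hypothetical stand-ins for trees trained on different random subsets.
trees = [make_stump(t) for t in (1.2, 1.4, 1.5, 1.6, 1.8)]

outcome = forest_outcome(trees, 1.7)   # four of the five stumps vote 1
pu_active = outcome > 0.5              # assumed majority rule
print(outcome, pu_active)
```

Because the trees disagree near the boundary, the averaged outcome also indicates how confident the ensemble is, which a single tree cannot provide.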

4.1.5. K-Nearest Neighbor (KNN)

In this learning model, classification is performed based on the similarity between the instances of the input datasets, rather than by learning a model of the underlying features of the input data. It is commonly described as a lazy learner because it makes predictions by memorizing the training data. The distance between the instances of the testing and training data may be expressed by Equation (26) as:

$$X = \|f_{m} - f_{n}\|^{2} \tag{26}$$

in which $X$ is the similarity measure, defined as the squared Euclidean distance between an instance of the testing data $f_{n}$ and an instance of the training data $f_{m}$. Given a positive integer $K$, whose value is chosen through experimentation and cross validation, the algorithm identifies the $K$ points in the training data that are closest to the testing data.
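A minimal Python sketch of this procedure follows, with the class of a test instance decided by a majority vote among its $K$ nearest training instances. The two-dimensional feature vectors, the labels (0 for PU inactive, 1 for PU active), and the value of $K$ are hypothetical and chosen only for illustration.

```python
from collections import Counter

def squared_distance(f_m, f_n):
    """Squared Euclidean distance between two feature vectors, as in Equation (26)."""
    return sum((u - v) ** 2 for u, v in zip(f_m, f_n))

def knn_predict(train_features, train_labels, query, k):
    """Majority label among the k training instances closest to query."""
    ranked = sorted(zip(train_features, train_labels),
                    key=lambda pair: squared_distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training set: each instance holds two sensing features.
train_features = [(0.9, 0.1), (1.0, 0.2), (1.1, 0.1),
                  (2.0, 0.9), (2.1, 1.0), (1.9, 0.8)]
train_labels = [0, 0, 0, 1, 1, 1]

decision_active = knn_predict(train_features, train_labels, (1.95, 0.85), k=3)
decision_idle = knn_predict(train_features, train_labels, (1.0, 0.1), k=3)
print(decision_active, decision_idle)
```

Note that all the work happens at prediction time: every query is compared with every stored training instance, which is why the method becomes costly for large training sets.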

4.1.6. Naïve Bayes

This learning technique utilizes Bayes’ theorem to predict the likelihood that a data point belongs to a specific class, assuming that all the features of the data are unrelated or independent. The Bayes’ theorem for calculating the posterior probability may be defined by Equation (27) as

$$P(b / a) = \frac{P(a / b) \cdot P(b)}{P(a)} \tag{27}$$

where $P(b / a)$ is the posterior probability, that is, the probability of class $b$ given features $a$; $P(a / b)$ is the likelihood, that is, the probability of features $a$ given class $b$; $P(b)$ is the prior probability of class $b$; and $P(a)$ is the marginal likelihood. In using the model for SS, class $b$ is assumed to represent the detection information concerning the absence and the presence of the PU, and $a$ represents the SU that is observing the PU. Using the naïve assumption, where each SU in the IoT network is independent, it is written that

$$P(a_{1}, a_{2}, a_{3}, \dots, a_{n} / b) = P(a_{1} / b) \cdot P(a_{2} / b) \dots P(a_{n} / b) \tag{28}$$

Consequently, Bayes’ theorem of Equation (27) transforms to the expression given by Equation (29) as

$$P(b / a_{1}, a_{2}, a_{3}, \dots, a_{n}) = \frac{P(b) \cdot \prod_{i=1}^{n} P(a_{i} / b)}{P(a_{1}) \cdot P(a_{2}) \dots P(a_{n})} \tag{29}$$

The model provides information about the PU by finding the state of the PU with the highest posterior probability, using Equation (30), which is written as

$$\hat{b} = \underset{b}{\arg\max} \left(P(b) \cdot \prod_{i=1}^{n} P(a_{i} / b)\right) \tag{30}$$

The state of the PU with the highest probability is chosen as the one representing the current status of the PU.
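The selection rule of Equation (30) is sketched below in Python for binary reports from three SUs. The prior probabilities, the per-SU likelihoods, and the report vector are made-up numbers used only to show the arithmetic; in practice they would be estimated from training data.

```python
def map_pu_state(prior, likelihoods, reports):
    """Equation (30): pick the PU state b maximizing P(b) * prod_i P(a_i / b).

    prior[b] is P(b); likelihoods[b][i] is the probability that SU i reports 1
    given state b; reports[i] is the binary report (0 or 1) of SU i.
    """
    scores = {}
    for state in prior:
        score = prior[state]
        for p_report_one, report in zip(likelihoods[state], reports):
            score *= p_report_one if report == 1 else 1.0 - p_report_one
        scores[state] = score
    return max(scores, key=scores.get), scores

# Hypothetical probabilities for three independent SUs.
prior = {"absent": 0.6, "present": 0.4}
likelihoods = {"absent": [0.1, 0.1, 0.2],    # false-alarm rate of each SU
               "present": [0.9, 0.8, 0.7]}   # detection rate of each SU

state, scores = map_pu_state(prior, likelihoods, [1, 1, 0])
print(state, scores)
```

The denominator of Equation (29) is the same for every state, so it can be dropped when only the most probable state is needed, which is why Equation (30) compares the numerators alone.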

4.2. Unsupervised Learning

In unsupervised ML, an algorithm is utilized to learn from unlabeled data without a target or expected output. The aim of unsupervised learning is to uncover hidden patterns, structures, and relationships within the data. Common tasks in unsupervised ML are clustering, dimensionality reduction, anomaly detection, and association rule learning. It has useful applications in image analysis and fraud detection, among others. Its accuracy is low because of the absence of input and output data pairs, and it requires large computational resources [69]. Examples are K-means clustering, Bayesian learning, principal component analysis (PCA), and independent component analysis (ICA).

4.2.1. K-Means Clustering

This learning model divides unlabeled data into K clusters. It assigns each data point to the cluster with the nearest centroid, based on a chosen distance metric such as the Euclidean distance. Each centroid is then recomputed as the mean of all the data points in its cluster. This process is repeated until the cluster assignments no longer change or a specified number of iterations is reached. The elbow technique is commonly utilized to determine the value of K. The authors in [70] have demonstrated the application of K-means clustering for SS in the radio network.
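The assignment and update steps described above can be sketched in a few lines of Python. This is an illustrative implementation that uses only the standard library, not the method of [70]; the function name `kmeans` and the idea of clustering per-SU energy vectors into an idle and a busy group are assumptions made for the example.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Cluster a list of equal-length tuples into k groups (Lloyd's algorithm)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize centroids from the data
    assignment = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_assignment = [
            min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])),
            )
            for pt in points
        ]
        if new_assignment == assignment:
            break  # assignments no longer change
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [pt for pt, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignment
```

With K = 2 and energy measurements as features, the cluster whose centroid has the larger energy could be interpreted as the busy state of the PU. That interpretation is an assumption of this sketch and would need validation on real sensing data.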

4.2.2. Bayesian Learning

This is an unsupervised ML technique that learns hidden patterns in unlabeled data. Bayesian learning determines the likelihood function that quantifies the probability of observing the data, given the chosen model and its parameters. It assigns data points to different clusters and identifies patterns based on the learned model parameters. The GMM is a powerful probabilistic model that is used in Bayesian learning to describe data features such as the number of clusters, the connection between the variables, and the shape and location of each cluster. With the GMM, each data point is assigned to the cluster with the highest probability. The algorithm utilizes Bayes’ theorem to understand the pattern and the distribution of the data. The parameters of the GMM are refined iteratively by using the posterior distribution from the previous iteration as the prior for the subsequent iteration. Through this process, the algorithm learns and improves its understanding of the underlying features of the input datasets.

4.3. Reinforcement Learning (RL)

In reinforcement learning (RL), the agent explores the environment, makes decisions, and receives feedback in the form of a reward or penalty, learning through trial and error. RL is classified into the Markov decision process (MDP), multi-armed bandit, dynamic programming, and temporal difference learning [71]. These classes are not discussed further here because comprehensive reports on them can be found in [72,73,74,75,76,77,78,79,80]. The following terms are pertinent to the understanding of RL and its deployment for SS.

(i): Agent: This is a device that explores the environment.
(ii): Environment: This refers to the circumstances in which the agent operates, providing states and rewards in response to the agent’s actions.
(iii): State: This describes the current status of the environment as seen by the agent.
(iv): Action: It refers to the choices that are available to the agent in an environment.
(v): Reward: It denotes the feedback received based on the actions of the agent.
(vi): Policy: This is the strategy that guides the agent’s behavior. It maps states to actions.
(vii): Value function: This measures the expected cumulative reward an agent receives by following a particular policy.
(viii): Model: This is a representation of the environment that the agent explores to forecast the outcome of its actions.

The basic algorithm for training RL is presented in Algorithm 1.

Algorithm 1: Algorithm for training RL.

#step 1: start
#step 2: define the environment
#step 3: state the reward
#step 4: define the agent
#step 5: train or validate the agent
#step 6: implement the policy
#step 7: end

An illustration of the MDP for identifying idle spectrums of the PU is presented in Figure 7. In the illustration, an agent (the SU) in a particular state S_{i} takes action A_{i} to access the spectral band of the PU (the environment). As a consequence, the agent receives feedback in the form of a reward R_{i}, which measures the effect of the selected action on the SU’s state [81]. The environment then transitions to the next state S_{i + 1}, and the SU receives the reward R_{i + 1} for the action taken in the new state following a policy β. The policy determines how the SU takes an action in different states. The SU continues to interact with the PU and receives a reward for every action taken. Through this process, the SU learns the characteristics of the environment and observes whether the PU band is vacant or not. The reward may be designed, for example, to minimize the spectrum sensing time.

The cumulative reward of the state S under policy β, when the agent’s first state is S_{0} = S, is defined by [82] as Equation (31):

v_{β} (S) = E_{β} [\sum_{i = 0}^{\infty} γ^{i} R (S_{i}, A_{i}, S_{i + 1}) | S_{0} = S]

(31)

where γ is the discount factor, with 0 ≤ γ < 1, which determines how strongly future rewards are weighted relative to immediate rewards.
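To make the role of the discount factor concrete, the short Python sketch below evaluates the discounted sum inside the expectation of Equation (31) for one finite sequence of rewards. The function name `discounted_return` is hypothetical, and a single sample path is used, so the result is one realization rather than the expectation itself.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**i * rewards[i] over one finite reward sequence."""
    g = 0.0
    # Accumulate from the last reward backwards: g_i = r_i + gamma * g_{i+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

For three consecutive rewards of 1 and a discount factor of 0.5, the sum is 1 + 0.5 + 0.25 = 1.75. A smaller discount factor makes the agent favor immediate rewards, while a value close to 1 makes it weigh long-term rewards almost as much as immediate ones.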

The objective of RL is to find an optimal policy that maximizes the expected cumulative reward of the agent. To this end, the action value, also referred to as the Q-value, is expressed by Equation (32) [82] as:

ℚ_{β} (S, A) = E_{β} [\sum_{i = 0}^{\infty} γ^{i} R (S_{i}, A_{i}, S_{i + 1}) | S_{0} = S, A_{0} = A]

(32)

Equation (32) represents the expected cumulative reward when the agent takes action A in state S and follows policy β thereafter. The optimal policy, given the set of action values of a particular state, is defined by Equation (33) as:

β^{*} (S) = \underset{A}{\arg \max} \{ℚ_{β}^{*} (S, A)\}

(33)

where β^{*} (S) is the optimal policy in state S and ℚ_{β}^{*} (S, A) is the optimal Q-value function, which is given by Equation (34) as:

ℚ_{β}^{*} (S, A) = \sum_{S'} P (S' / S, A) (R (S, A, S') + γ v_{β^{*}} (S'))

(34)

Here, P (S' / S, A) is the probability that the environment moves to state S' when action A is taken in state S. The cumulative reward of the agent in state S under the optimal policy then admits the form given by Equation (35) [82]:

v_{β^{*}} (S) = \max_{A} ℚ_{β}^{*} (S, A) = \max_{A} \sum_{S'} P (S' / S, A) (R (S, A, S') + γ v_{β^{*}} (S'))

(35)
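When the transition probabilities are known, the recursion in Equations (34) and (35) can be solved numerically by value iteration, with the discount factor γ of Equation (31) applied to the value of the next state. The Python sketch below is an illustration under stated assumptions: the function name `value_iteration`, the dictionary layout of `P`, and the convergence tolerance are hypothetical choices, not part of [82].

```python
def value_iteration(states, actions, P, R, gamma, tol=1e-9):
    """Solve the Bellman optimality recursion for a small, fully known MDP.

    P[(s, a)][s2] : transition probability P(s2 / s, a)
    R(s, a, s2)   : reward function
    """
    v = {s: 0.0 for s in states}
    while True:
        # Q-values computed from the current value estimates.
        q = {
            (s, a): sum(P[s, a][s2] * (R(s, a, s2) + gamma * v[s2]) for s2 in states)
            for s in states
            for a in actions
        }
        # Best achievable value in each state.
        new_v = {s: max(q[s, a] for a in actions) for s in states}
        delta = max(abs(new_v[s] - v[s]) for s in states)
        v = new_v
        if delta < tol:
            break
    # Greedy policy with respect to the converged Q-values.
    policy = {s: max(actions, key=lambda a: q[s, a]) for s in states}
    return v, q, policy
```

As a usage example, consider a toy PU channel with states idle and busy, actions transmit and wait, a reward of +1 for transmitting on an idle channel, and a reward of −1 for colliding with the PU. With such a model, the sketch returns the expected policy of transmitting only when the channel is idle. The need for a complete model `P` is exactly what limits this model-based approach in practice.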

The algorithms for maximizing the cumulative reward in the MDP may be model-based or model-free. The former require not only high computational effort but also perfect knowledge of the environment, which may not be available when the communication link is designed. Model-based algorithms therefore have limited applications [83] and are rarely used for SS. Examples of algorithms that fall under this category are Dyna, model-based policy optimizer (MBPO), Dreamer, and probabilistic inference for learning control (PILCO). Model-free algorithms are commonly used, owing to their faster rate of convergence and ease of implementation [84]. They require less computational overhead and plan their next action by learning from the environment [84]. Examples are Q-learning, actor–critic, state–action–reward–state–action (SARSA), the Monte Carlo model, and the policy gradient method. These learning models are also classified as temporal difference algorithms [83]. To receive the maximum reward in Q-learning, for example, the agent explores the environment by taking actions, some of them random, and updates the Q-value of each state–action pair by using Equation (36) [82]:

ℚ (S^{i}, A^{i}) \leftarrow (1 - δ) ℚ (S^{i}, A^{i}) + δ (R^{i + 1} + γ \max_{A} ℚ (S^{i + 1}, A))

(36)

in which δ is the learning rate.
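The Q-learning update can be sketched as a tabular, model-free learner on a toy two-state PU channel. Everything about the environment here is an assumption made for illustration: the state and action names, the occupancy dynamics in `P_STAY`, the rewards, the ε-greedy exploration rule, and the simplification that the SU observes the channel state perfectly. The maximum in the update target is taken over all actions available in the next state.

```python
import random

STATES = ("idle", "busy")
ACTIONS = ("transmit", "wait")
P_STAY = {"idle": 0.8, "busy": 0.7}  # assumed PU occupancy dynamics


def step(state, action, rng):
    """Toy environment: return (reward, next_state)."""
    reward = 0.0
    if action == "transmit":
        reward = 1.0 if state == "idle" else -1.0  # success vs. collision
    if rng.random() < P_STAY[state]:
        next_state = state
    else:
        next_state = "busy" if state == "idle" else "idle"
    return reward, next_state


def q_learning(steps=50000, delta=0.05, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; delta is the learning rate, gamma the discount factor."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = "idle"
    for _ in range(steps):
        # Epsilon-greedy exploration: occasionally take a random action.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[state, a])
        reward, next_state = step(state, action, rng)
        # Blend the old estimate with the bootstrapped target.
        target = reward + gamma * max(q[next_state, a] for a in ACTIONS)
        q[state, action] = (1 - delta) * q[state, action] + delta * target
        state = next_state
    return q
```

Note that the learner never reads `P_STAY` directly; it only sees sampled transitions, which is what makes the method model-free. In a realistic SS setting the state would be a noisy sensing observation rather than the true occupancy.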

RL is useful for SS in CRIoT networks because it requires little training data to learn the dynamic changes in the status of the primary channel and to cope with the non-stationarity of the wireless channel.

4.4. Deep Learning (DL)

DL utilizes an artificial neural network (ANN) with many layers to analyze and learn complex patterns in input data. Figure 8 shows an ANN with zero hidden layers that accepts a number of inputs (x_{1}, x_{2}, x_{3}, \dots, x_{n}), sums the weighted inputs and the bias b, and passes the result as the argument of an activation function to give the output y. This process is described by Equation (37) [85] as:

y = φ (\begin{bmatrix} w_{1} & w_{2} & w_{3} & \dots & w_{n} \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \\ \vdots \\ x_{n} \end{bmatrix} + b) = φ (\sum_{i = 1}^{n} w_{i} x_{i} + b)

(37)

where φ denotes the activation function, b is the bias, and w is the synaptic weight.
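The computation of Equation (37) amounts to a weighted sum followed by an activation function, as the following minimal Python sketch shows. The function names `sigmoid` and `neuron` and the choice of the sigmoid as the default activation are assumptions for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, w, b, phi=sigmoid):
    """Output of a single neuron: phi(sum_i w_i * x_i + b)."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return phi(z)
```

For inputs `[1.0, 2.0]`, weights `[0.5, -0.25]`, and zero bias, the weighted sum is 0 and the sigmoid output is 0.5.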

The weights of the ANN are updated by minimizing a loss function, such as the mean square error (MSE) between the network output and the expected output or target, using optimizers such as stochastic gradient descent (SGD) and root mean square propagation (RMSProp) [86]. Vanishing gradients, overfitting, and slow convergence occur when an ANN with many hidden layers is utilized to handle a complex task. This necessitates DL architectures, consisting of multiple hidden layers, that are designed to overcome these limitations of the ANN. A DNN mitigates overfitting by training some randomly selected nodes instead of the entire network in each training step. This is called dropout. Dropout rates of 20%, 25%, and 50% are commonly applied in the hidden layers, which in turn reduces the time required in training the network. DL is divided into supervised and unsupervised learning as well as deep reinforcement learning (DRL). Supervised DL is subdivided into multilayer perceptron (MLP), convolutional neural network (CNN), and recurrent neural network (RNN). An example of unsupervised DL is the autoencoder.

4.4.1. Multilayer Perceptron

This is a feed-forward DNN that consists of fully connected layers. An illustration of the MLP architecture for SS is shown in Figure 9. The illustration depicts a CRIoT network with channel coefficients (H_{1}, H_{2}, H_{3}, \dots, H_{N}) from the PU to the SUs (S U_{1}, S U_{2}, S U_{3}, \dots, S U_{N}). The control center combines the sensing information from the SUs, which serves as the input to the MLP architecture. After repeated activation at the different hidden layers and updating of the weights, the network generates the corresponding output y, which indicates the presence or the absence of the PU in the wireless network. MLP suffers from vanishing gradients and overfitting when the network architecture contains a large number of hidden layers. A number of solutions have been proposed to address this problem, including early stopping of the training, dropout, and regularization [63].

4.4.2. Convolutional Neural Network

MLP falters when it is utilized to handle large and nuanced input data such as image signals. It suffers from the dimensionality problem because every neuron in one layer is fully connected to the next layer [63]. These concerns motivate the introduction of the CNN to extract features in complex signals for classification and regression tasks. A CNN consists of convolution, pooling, and fully connected layers. The convolution layer employs kernels to extract the feature map from the input data. The pooling layer reduces the spatial dimension of the extracted feature map, while the fully connected layer performs the classification or regression task. The output layer generates the output of the network. CNNs find useful applications in image classification, object detection, image denoising, image recognition, and segmentation. Popular types are residual neural networks (ResNet), AlexNet, VGG-16, GoogleNet, and LeNet [87,88,89,90]. CNNs capture the spatial characteristics of the PU signal by recognizing the intricate pattern of the radio spectrum, improving spectrum detection in the wireless network [8]. They combat the vanishing gradient of MLP through normalized initialization and batch normalization. To overcome overfitting and achieve good performance, CNNs require large amounts of labeled training data. The training process involves computationally intensive calculations in each layer of the CNN, and gradients are determined for every parameter using backpropagation [87]. Given that a training session may involve thousands of iterations on a large quantity of training data, the overall computational complexity of generating SS results in an IoT network can be immense.
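The convolution and pooling operations described above can be illustrated in one dimension with plain Python. This is a simplified sketch and not a full CNN: the function names `conv1d`, `relu`, and `max_pool1d` are hypothetical, no padding or stride options are modeled, and, as in most DL libraries, the "convolution" is computed as a sliding dot product without flipping the kernel.

```python
def conv1d(signal, kernel):
    """Slide the kernel over the signal (valid mode) to build a feature map."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k)) for i in range(n - k + 1)]


def relu(x):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, v) for v in x]


def max_pool1d(x, size):
    """Keep the maximum of each non-overlapping window of the given size."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]
```

Applied to a sequence of energy samples, a kernel such as `[-1, 1]` responds to rising edges, for instance where the PU starts transmitting, and pooling then shortens the resulting feature map. In a trained CNN the kernel values are learned rather than chosen by hand.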

4.4.3. Recurrent Neural Network

In this learning method, the temporal features of input datasets are captured. It excels in handling tasks that require memory or the recall of past information. In the RNN architecture, the output is fed back into the network as an input, creating a closed-loop system. This allows the architecture to retain memory of previous inputs. The memory characteristic of an RNN suits the model for handling sequential data in time series form [9]. Owing to its ability to recall past events, it is a valuable tool for using historical data to predict the present and the future spectrum usage of the PU in CRIoT networks [9]. The author in [91] describes the output of RNNs as Equation (38):

y_{i} = φ_{y} (D_{y} h_{i} + k_{y})

(38)

where y is the output vector, φ denotes the activation function, k is the bias vector, D is the weight matrix for the hidden-to-output layer connection, and h represents the hidden layer vector, expressed by Equation (39) [92] as:

h_{i} = φ_{h} (W_{h} a_{i} + U_{h} h_{i - 1} + k_{h})

(39)

where W is the weight matrix for the input-to-hidden layer connection, U is the weight matrix for the hidden-to-hidden connection, and a is the input vector. Examples of RNNs are LSTM, gated recurrent unit (GRU), and bidirectional RNN. RNNs require a high volume of data for training. The computational complexity of RNNs is high for two main reasons: the difficulty of training on large datasets and the sequential manner of processing the data, which limits parallel processing. The application of LSTM for SS has been demonstrated by the authors in [92]. To the best of the authors’ knowledge, no work has been reported in the literature on the application of GRU and bidirectional RNN for SS.
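The recurrence of Equations (38) and (39) can be sketched as a forward pass over a sequence. In the illustrative Python code below, the input-to-hidden matrix multiplies the input a_{i} and the hidden-to-hidden matrix multiplies the previous hidden state h_{i - 1}. The function names, the bias names `k_h` and `k_y`, the zero initial hidden state, and the choice of tanh and sigmoid activations are assumptions of this sketch rather than details taken from [91,92].

```python
import math


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def rnn_forward(inputs, W_h, U_h, k_h, D_y, k_y):
    """Return the output vector y_i for every input vector a_i in the sequence."""
    h = [0.0] * len(k_h)  # assumed zero initial hidden state
    outputs = []
    for a in inputs:
        # Hidden state update: current input plus memory of the previous state.
        pre_h = [x + y + z for x, y, z in zip(matvec(W_h, a), matvec(U_h, h), k_h)]
        h = [math.tanh(p) for p in pre_h]
        # Output computed from the hidden state.
        pre_y = [x + z for x, z in zip(matvec(D_y, h), k_y)]
        outputs.append([1.0 / (1.0 + math.exp(-p)) for p in pre_y])
    return outputs
```

Feeding a nonzero input followed by a zero input shows the memory effect: the second output still differs from the value it would take with an empty history, because the hidden state carries information from the first step.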

4.4.4. Autoencoder

This is an unsupervised DL technique that learns the essential features of the input dataset by projecting the input data into a smaller, dense representation called the latent space. The model is trained to reduce the error between the input and the output [93]. The autoencoder architecture for SS in the cooperative IoT network is shown in Figure 10. It consists of the encoder, the bottleneck, and the decoder. The encoder compresses the spectrum information from the control center into the latent space. The bottleneck captures the essential feature map in the encoder’s output, while the decoder is used to determine whether the PU band is occupied or not. It produces detection information concerning the status of the PU. Other applications of autoencoders include anomaly detection, image denoising, feature learning, and dimension reduction [93]. Variational, sparse, convolutional, and denoising autoencoders are the common types. The use of the autoencoder for SS has been demonstrated by the authors in [93,94]. In [93], a variational autoencoder is utilized for SS, where the test statistic is created from the signal sample.

4.4.5. Multi-Agent Deep Reinforcement Learning (MADRL)

Supervised, unsupervised, and deep learning models exhibit poor performance when deployed to handle tasks that involve making complex decisions from few datasets. DRL is introduced to provide solutions to complex decision-making problems [95,96] with little training data. It combines DL and RL to perform functions in areas such as game playing, robotics, finance, healthcare, and smart grid monitoring, among others [95,96,97]. Figure 11 illustrates the proposed multi-agent deep reinforcement learning (MADRL) architecture that fuses MLP (DL) and multi-agent RL for SS. MLP is utilized to extract features from the input data, which will be transformed into more explicit representations using any of the rectified linear unit (ReLU), sigmoid, and tanh activation functions. The hidden layer will transform the input data through weighted sums and non-linear mapping, where each neuron detects a specific pattern or feature of the training data. In this way, MLP extracts complex relationships and patterns in the training data. Multi-agent RL utilizes multiple agents (SUs) to learn, interact with the PU (environment), take action, and observe the resulting state and reward. The agents learn value functions or policies for maximizing the cumulative reward. Q-learning will be used for optimizing the policy or value function [98,99,100]. The role of multi-agent RL in the architecture is to recognize the non-stationarity in the status of the PU (environment) and to generate scalable and optimal output that converges faster than single-agent RL [101,102]. Thus, the proposed MADRL architecture will leverage the excellent data processing ability of MLP to extract several features of the PUs in a dense IoT network, where many IoT devices transmit and receive, and will deploy multi-agent RL for sensing, enabling IoT devices to occupy the vacant spectrum of the PUs without delay.

Table 2 summarizes the strengths and weaknesses of various approaches to SS earlier discussed.

5. Review of the Application of ML for SS in Cognitive Radio Networks

The attention is shifted here to the review of various contributions on the application of ML for SS in CRIoT networks. These studies are reported by the authors in [66,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122] and are summarized in Table 3. In [66], the authors consider RF, SVM, decision tree, NB, KNN, LR, and ANN for SS, where it is shown that RF outperforms SVM, decision tree, NB, KNN, LR, and ANN in terms of P_{D}, P_{F A}, P_{M D}, and accuracy. However, all the ML techniques considered as candidates for SS exhibit high P_{F A}, which is undesirable in radio networks. The authors in [103] investigate MLP, SVM, and NB for SS in a cooperative radio network. By using receiver operating characteristics and area under the curve as metrics, it is demonstrated that MLP exhibits better performance than SVM and NB. The authors do not assess the models under the non-stationary conditions of wireless channels, where SUs and PUs are mobile, which typify practical real-world applications. The research is also limited to the case of a single PU and three SUs. In [104], the authors present a hybrid CNN-RNN for SS and further incorporate transfer learning to improve the performance of the hybrid model at a low SNR. The proposed technique achieves high accuracy but incurs large training overheads. The high computational complexity of the proposed model may limit its implementation in large-scale networks.
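Because most of the studies reviewed in this section are compared through the probability of detection, the probability of false alarm, the probability of missed detection, and accuracy, the following Python sketch shows how these metrics are typically estimated from labeled sensing decisions. The function name `sensing_metrics` and the 0/1 encoding (1 = PU present) are assumptions for illustration.

```python
def sensing_metrics(truth, decisions):
    """Estimate P_D, P_MD, P_FA and accuracy from binary labels and decisions."""
    if len(truth) != len(decisions):
        raise ValueError("truth and decisions must have the same length")
    pairs = list(zip(truth, decisions))
    tp = sum(1 for t, d in pairs if t == 1 and d == 1)  # PU present, detected
    fn = sum(1 for t, d in pairs if t == 1 and d == 0)  # PU present, missed
    fp = sum(1 for t, d in pairs if t == 0 and d == 1)  # PU absent, false alarm
    tn = sum(1 for t, d in pairs if t == 0 and d == 0)  # PU absent, declared idle
    busy, idle = tp + fn, fp + tn
    if busy == 0 or idle == 0:
        raise ValueError("both PU-present and PU-absent samples are required")
    return {
        "P_D": tp / busy,
        "P_MD": fn / busy,
        "P_FA": fp / idle,
        "accuracy": (tp + tn) / len(pairs),
    }
```

A high false-alarm probability means that idle bands are wrongly declared busy, so transmission opportunities are wasted, whereas a high missed-detection probability means that the SU may transmit over an active PU and cause interference.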

The authors in [105] demonstrate the effectiveness of RNNs and CNNs in expanding wireless capacity by identifying unoccupied bands that may be used by secondary devices. Simulations illustrating P_{D} and P_{F A} suggest that the proposed DL models surpass MF and CFD. However, the learning models require large datasets for training, limiting deployment in computer systems with few resources. The contribution of [106] introduces LSTM for SS in non-cooperative radio networks by using modulated received signals as datasets. It is seen that LSTM outperforms CFD and ED. The limitations of the study are reliance on large training datasets and a long spectrum detection time that may not be feasible in real-time applications. In [107], ResNet50 is proposed to show that the PUs are detected in CRIoT networks if the congestion rate is low. The authors do not optimize the model to reduce computational time at high SNR values. In [108], the authors propose an ANN for SS in cognitive radio networks. The ANN architecture consists of one hidden layer and output layers. The learning model is trained with ED and a likelihood ratio test scheme. The model is deployed to detect vacant spectral bands of the PU and it achieves 63% gain over ED and improved ED. The authors assume a communication system with a single PU and a single SU, which limits the adoption of the study in real-time applications, where many PUs and SUs interact and communicate. The contribution of the authors in [109] utilizes K-means clustering to sense the vacant spectrum in a cooperative network, characterized by Rayleigh, Rician, and Nakagami fading channels. It is shown that K-means clustering demonstrates better performance than ED and fusion-based schemes. The learning method requires large training datasets that are not readily available in real-time applications. The authors in [110] reduce the computational time for making decisions at the control center in a cooperative network. Q-learning is utilized to update the action–reward value of the SS algorithm. The authors do not take into account unpredictable and dynamic channel states in wireless channels, which may affect the reliability of the study for practical applications.

In [111], the authors optimize the performance of RF using Bayesian learning, which is shown to be more accurate than SVM, KNN, and GMM. The learning model achieves P_{D} and P_{F A} of 0.94 and 0.1, respectively. However, the work is limited to the case of a single PU and three SUs, which does not accurately represent practical wireless communication environments with many potential users. RL is utilized by the authors in [112] to improve SS accuracy in a cooperative IoT network, where an actor–critic learning process is adopted. An evaluation of the performance of the learning model in terms of P_{D} and P_{F A}, which are critical in SS, is not reported. The study overlooks real-world challenges and imperfections like small- and large-scale fading as well as noise interference, which may impede practical deployment. The authors in [113] train three CNN models, namely AlexNet, LeNet, and VGG-16, for SS, where the covariance matrix of the sensing information at the control center is removed and updated information is used as the input into the CNN models for the final decision. The decisions obtained from the CNNs are communicated to the SUs. All the learning methods demonstrate better accuracy than AND, OR, and voting-based schemes. In addition, VGG-16 exhibits higher P_{D} and lower P_{F A} compared to the other CNN models. The improved performance of VGG-16 requires more computer resources, which may incur a higher computational cost. The study also relies on large datasets for training, which may not be available in real-time applications.

The authors in [114] utilize an ANN to improve the performance of ED and MF at low SNRs, resulting in ANN-based ED and MF techniques. The proposed models improve the detection accuracy and reduce the false alarm rate and the bit error rate (BER). The study demonstrates the superiority of ANN-based MF to ANN-based ED, conventional ED, conventional CFD, and SVM. The drawbacks of the proposed techniques are high computational complexity, reliance on large labeled datasets, and the difficulty of implementation in devices with limited computer resources due to vast memory and high computational power requirements. The proposed models need to be practically validated for real-world applications. The authors in [115] present RL for SS in a cooperative network. The study effectively shows that RL performs better than SVM, KNN, and the wavelet transform. The paper lacks detailed analysis concerning practical deployment and challenges in real-time applications. In [116], the authors introduce an SVM for reducing the computational overhead required for detecting vacant spectrums and improving sensing performance in a cooperative IoT network. The proposed ML technique groups SUs into various classes for easy identification of unoccupied PU bands. The first class groups SUs into abnormal and normal users, the second classifies SUs into redundant and non-redundant users, the third groups SUs into an optimized cooperation class, and the fourth classifies the SUs into superior and inferior users. Simulation results portray the superiority of the proposed technique to conventional techniques. The study overlooks imperfections such as fading, shadowing, terrestrial effects, and noise interference in real-world applications. The implementation of the proposed technique in a multi-band cooperative scenario is also not considered in the study.

The authors in [117] propose RL for reducing the amount of computational resources required for scanning the PU and reducing the delay in sensing unused spectrums in a cooperative users’ network. The study considers the mobility of the PUs and the SUs as well as the dynamic nature of the wireless channel in its analysis, which is critical in real-world applications. The performance of the proposed method is evaluated in terms of the average detection probability, the average number of times that the SU successfully accesses the PU band, and the call block rate. Analytical results demonstrate that the proposed method achieves a high probability of detection and a low call block rate. The study fails to account for noise interference in real-time applications. In the experimental study of [118], the authors consider ANN, SVM, decision tree, and KNN to sense real signals generated by a low-cost smart embedded device that operates at 433 MHz and a wireless transmitter with amplitude shift keying (ASK) and frequency shift keying (FSK) modulation schemes. Simulation results reveal that the ANN and SVM are more accurate for the sensing task than decision tree and KNN. The drawbacks of the study include dependence on a huge number of real signals for training the algorithms and failure to account for shadowing and multipath fading, which degrade performance in real-time implementation. The authors in [119] present an innovative solution to SS problems within the noisy regime of the wireless channel by proposing a hybrid ML technique that combines a CNN and LSTM. The authors refer to the proposed technique as SenseNet. The study shows that the proposed solution achieves a 3.3% gain in sensing accuracy at an SNR of −20 dB, when compared to conventional CNNs. The limitations of the study include high computational complexity and reliance on a large quantity of data for training.

The study in [120] proposes a deep autoencoder to learn the features of the primary communication node, which is further classified by SVM into active and passive nodes. It is shown that a deep autoencoder with a radial basis kernel exhibits the highest accuracy on spectrogram signals, while a deep autoencoder with a linear kernel demonstrates the best accuracy on amplitude phase signals. The research overlooks the imperfections associated with the wireless channel, limiting the adoption of the model for practical implementation. The authors in [121] consider deep, variational, and LSTM autoencoders for sensing and for distinguishing between LTE and Wi-Fi signals. The research effectively shows that deep and LSTM autoencoders have better values of recall and F1-score. The variational autoencoder exhibits difficulty in recognizing LTE and Wi-Fi signals. It is also demonstrated in the study that the deep autoencoder converges faster than the LSTM autoencoder but produces lower values of recall, while the values of precision for the LSTM and deep autoencoders are the same. The research overlooks the impact of wireless channels on signal transmission. The effects of fading, shadowing, and noise are not considered, which may hinder the implementation of the study.

Similarly, the research in [122] proposes a denoising autoencoder for detecting the vacant spectrum in integrated sensor and communication networks. The authors demonstrate the effectiveness of the proposed approach in detecting the vacant spectrum and its robustness to shadowing. Mobility or non-stationarity of the wireless channel, which is critical in real-life deployment, is overlooked in the study. To the best of the authors’ knowledge, no work has been reported in the literature on MADRL for SS in CRIoT networks. It is therefore proposed in this work for enhancing the accuracy of spectrum detection in dynamic wireless channels and for addressing the research concerns observed in the foregoing.

The trend in the adoption of ML for SS by researchers from 2018 to 2025, as captured in Table 2, is illustrated in Figure 12. It is observed that supervised ML receives the most attention, closely followed by supervised DL, while unsupervised ML techniques receive the least attention. The preference for supervised ML among most researchers may be connected to its modest demand for computer resources, its ease of implementation, and its scalability across diverse datasets.

6. Importance of SS in 5G and Beyond IoT Networks

The rising data volume and the introduction of data-intensive applications make the management of spectrum resources in the present cellular network more necessary than ever. Hence, the importance of SS in the current network infrastructure cannot be overemphasized. The advantages of SS in 5G and beyond IoT networks include the following:

(i): Management of existing data spectrum: SS encourages efficient usage of spectrum resources by identifying vacant portions of primary channels that may be deployed for communication [123,124]. This saves the cost of developing new bandwidth to accommodate the upsurge in mobile connectivity and upcoming data-intensive applications.
(ii): Enhancement of capacity and spectral efficiency of the wireless network: SS in 5G and beyond IoT networks has the potential to free up space for communication, increasing throughput and spectral efficiency [15]. With SS, new IoT devices will not be short of spectral bands and the development of IoT will not be endangered in any way.
(iii): Avoidance of interference with primary communication: SUs are opportunistic users and they become active when PU bands are free or inactive. Thus, IoT devices will only occupy unused spectrums of inactive primary channels. This process reduces interference [125], prevents network delays, and reduces potential attacks on the operation of the PUs.
(iv): Optimization of spectrum usage: SS supports efficient optimization of the usage of the spectrum in CRIoT networks [126]. This prevents spectrum wastage and guarantees the availability of spectral bands for communication in 5G and beyond IoT networks.

7. Lessons Learnt

The takeaways from the comprehensive survey carried out in this work are listed as follows:

(i): Robustness of SS models: The wireless channel is characterized by fading, shadowing, and terrestrial disturbances. As a result, SS models must be able to withstand variations in the communication channel. RL is well suited to such an environment. The agent is trained to understand and interact with the environment, and its decisions are based on the maximization of the reward, which guides the training process.
(ii): Training data: The performance of ML algorithms for SS depends on the amount of data that is used to train the models. Supervised ML exhibits high accuracy and converges faster when ample training data are available.
(iii): Difficulty in the optimization of the models: Because of the large computational resources required for training RL and DRL models, the training process can be slow, tedious, and unstable, and may be difficult to optimize, leading to inaccurate and inconsistent spectrum detection results.
(iv): Generalization: Model-free RL algorithms may struggle to generalize beyond their training environment and may be unable to adapt to new conditions.
(v): Real world applications: The trial-and-error approach of RL and DRL may create problems in real-world implementation.

8. Challenges of SS in CRIoT Network

An appraisal of the previous studies reveals quite a number of challenges that may potentially derail existing SS techniques from achieving accurate results. These challenges include the following:

(i): Requirement for large training data: Supervised ML algorithms require a huge amount of training data for accurate SS results. In real-life scenarios, it may be impossible to obtain enough training data to capture the true characteristics of the ever-changing network environment. Current efforts in this regard use computer-generated data, which are based on simplified assumptions about the network environment and may not model the nuanced and complicated features of real-world wireless channels [127].
(ii): Huge computational resources: Supervised DL techniques such as CNN and RNN require significant computational resources. This requirement can be a challenge in real-time processing, especially on devices with limited computational resources.
(iii): Overfitting: DL techniques are susceptible to overfitting because of their high model complexity. This may result in poor performance on data not seen during training.
(iv): Poor understanding of ML models: Many ML models behave as black boxes whose decisions are hard to interpret. In some cases, it is necessary to know why a certain SS decision was taken.
(v): Hardware consideration: RL and DRL require fast computing power and specialized hardware such as graphics processing units (GPUs) for smooth operation and implementation. Their computational complexity in real-time applications may cause difficulty when they are deployed in large-scale networks. This factor also limits their deployment on devices with constrained hardware resources.
(vi): Dynamic propagation channel: The network environment changes from time to time. Ensuring that SS techniques adapt to these changes is still a concern in real-world applications and may necessitate training, retraining, and fine-tuning of the existing models.
(vii): ML architecture: The architecture of the ML model strongly influences its performance. A poor choice of architecture may lead to poor results.
(viii): Powering IoT devices: IoT devices run on batteries with a limited lifetime. This raises two further challenges: greenhouse effects due to improper disposal of the batteries, and large operational costs owing to their constant replacement. This is not sustainable. An alternative proposed in the literature is to equip IoT devices with energy harvesters. The authors in [128] have theoretically described various energy sources that can be used, including ambient environments, mechanical sources, thermoelectric generators, and triboelectric nanogenerators. The authors in [129,130] have shown that only a small DC voltage (on the order of millivolts) is obtained from an ambient environment by deploying an energy harvester consisting of a rectifier circuit and an antenna. This voltage is not enough to power a large-scale IoT network. The other theoretically proposed sources are still at an early stage of practical deployment.

9. Future Prospects

The following future prospects and open issues are highlighted for further exploration by the research community, with a view to mitigating the challenges in SS and improving the accuracy of spectrum detection in 5G-and-beyond IoT networks.

(i): Interference measurement: Because the PUs are passive devices, SUs may not be aware of their precise location in the communication network. As a result, SUs may interfere with the operation of the PUs. To the best of our knowledge, a device that allows SUs to estimate the interference at nearby PUs does not currently exist. Such a device is essential in real-time implementation, and purposeful research efforts should be devoted to addressing this concern.
(ii): Huge data transfer: In CRIoT networks with many IoT devices, transferring all sensing data to the control center may require a huge volume of data to be transmitted in real time. Transferring this information wirelessly without interfering with the operation of the PU may be difficult to achieve. Efforts should be geared towards addressing this concern without affecting primary communication.
(iii): SS in multi-user environments: Most works in the literature have considered a single PU and multiple SUs as the design model for investigation. IoT devices, however, operate in multi-user network environments where multiple PUs and SUs are present. There is a need to exploit the spatial diversity inherent in such multi-dimensional network environments for SS.
(iv): Reduced time for SS: Critical attention should be paid to reducing the time taken to sense an unused spectrum of the PUs. Future work should focus on further reducing the sensing time, so that fewer samples are used to detect the PU without compromising accuracy and without increasing the probabilities of missed detection, $P_{MD}$, and false alarm, $P_{FA}$.
(v): Falsified results: Malicious attacks from the SUs reduce the accuracy of SS results. The attack may take the form of false sensing results that are sent by the SUs to mislead the control center. Though the authors in [114,131] consider this scenario in their investigations, there is a need for more robust SS techniques that can detect and mitigate such attacks.
(vi): Emulation of the PU: A malicious SU may emulate the characteristics of the PU in order to prevent other SUs from accessing the primary channel [132]. This challenge calls for novel SS techniques that give all SUs equal access opportunities and ensure fairness among them [133].
(vii): Cost of implementation: The requirement for specialized hardware for implementing DRL and RL models may increase the cost of the hardware resources for SS in 5G-and-beyond networks. Energy-efficient hardware design, with miniaturized components and lower power consumption, will be vital for detecting vacant spectrums in multi-dimensional network environments. This is necessary for reducing the overall cost of implementation and execution.
(viii): Role of transfer learning: Transfer learning can be used to accelerate the detection of unoccupied spectrums in CRIoT networks, where pre-trained CNNs (or other deep learning models) are used to extract the features of the input data. The characteristics learned in the hidden layers of pre-trained CNNs or MLPs are then reused as features for new, related tasks, leading to improved performance [134]. Though the authors in [104] have cascaded hybrid CNNs and RNNs with transfer learning to improve spectrum detection at low SNRs, more research is needed that leverages transfer learning to reduce spectrum detection time.
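
Item (iv) can be made concrete for the simplest detector. For ED with a complex PSK primary signal in circularly symmetric complex Gaussian noise, the usual Gaussian (large-sample) approximation gives the minimum number of samples as $N_{\min} = \frac{1}{\gamma^{2}}\left[Q^{-1}(P_{FA}) - Q^{-1}(1-P_{MD})\sqrt{2\gamma+1}\right]^{2}$, where $\gamma$ is the received SNR and $Q^{-1}$ is the inverse Gaussian tail function. The sketch below evaluates this expression. The signal and noise model, the function names `q_inv` and `min_samples`, and the example operating points are assumptions made for illustration and are not taken from the surveyed works.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def q_inv(p):
    """Inverse of the Gaussian tail function Q(x) = 1 - Phi(x)."""
    return _STD_NORMAL.inv_cdf(1.0 - p)


def min_samples(snr_db, p_fa, p_md):
    """Approximate minimum sample count for energy detection.

    Assumes a complex PSK signal in complex Gaussian noise and the
    large-sample (Gaussian) approximation of the test statistic.
    """
    if not (0.0 < p_fa < 0.5 and 0.0 < p_md < 0.5):
        raise ValueError("this sketch expects 0 < p_fa, p_md < 0.5")
    snr = 10.0 ** (snr_db / 10.0)  # convert dB to linear scale
    p_d = 1.0 - p_md               # target detection probability
    root = (q_inv(p_fa) - q_inv(p_d) * math.sqrt(2.0 * snr + 1.0)) / snr
    return math.ceil(root ** 2)


# Example operating points (assumed): P_FA = P_MD = 0.1 at several SNRs.
for snr_db in (-5, -10, -15, -20):
    print(snr_db, "dB ->", min_samples(snr_db, 0.1, 0.1), "samples")
```

At low SNR the required sample count grows roughly as $1/\gamma^{2}$, which shows why shortening the sensing time without raising $P_{MD}$ or $P_{FA}$ is difficult for ED alone.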

10. Conclusions

CR is essential for enhancing the growth of IoT communication in stochastic network environments characterized by shadowing, slow and fast fading, and terrestrial disturbances. One of the core functions of CR is SS, where the IoT devices sense and transmit in the unused portion of the licensed spectrum assigned to the PUs. This process enables maximum spectrum utilization. Numerous methods are available in the literature for SS, each with its pros and cons. This paper presents a survey of works on SS that brings readers up to date with the latest efforts in this domain. We focus specifically on the description of the techniques widely adopted for SS, including ED, CFD, MF, Pietra–Ricci detection, and ML. We highlight various articles where the application of these methods is demonstrated. In addition, we identify the drawbacks and challenges that affect the implementation of the various techniques. Finally, prospects and open areas that may be explored in future research are highlighted. It is our belief that the article will provide guidance to communication system designers and researchers embarking on research in this field.

In our view, the importance of model-free RL algorithms for SS in 5G-and-beyond networks cannot be ignored. RL is robust to extraneous network challenges, which is essential for obtaining accurate spectrum detection results. This gain may be further consolidated by multi-agent RL, which uses two or more agents to interact simultaneously with primary sources. Multi-agent RL can recognize non-stationarity in the network environment and yield scalable and optimal solutions that converge faster than single-agent RL. Moreover, MLP is capable of handling the large amount of sensing information generated by many IoT devices in dense networks. Thus, MADRL is proposed for addressing SS problems in 5G-and-beyond networks with many IoT devices and multiple primary channels. The authors are currently investigating the application of MADRL for SS in CRIoT networks, and the outcome of the investigation will be reported in a forthcoming paper.

Author Contributions

Conceptualization, A.A.R. and T.O.O.; methodology, A.A.R. and T.O.O.; investigation, A.A.R. and T.O.O.; writing—original draft preparation, A.A.R.; writing—review and editing, A.A.R. and T.O.O.; funding acquisition, T.O.O. All authors have equally contributed to the article. All authors have read and agreed to the published version of the manuscript.

Funding

This research is funded by Tshwane University of Technology, Pretoria, South Africa, and the National Research Foundation (NRF), South Africa.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

5G	Fifth generation
ANN	Artificial neural network
ASK	Amplitude shift keying
AWGN	Additive white Gaussian noise
BER	Bit error rate
CFD	Cyclostationary feature detection
CNN	Convolutional neural network
CR	Cognitive radio
CRIoT	Cognitive radio internet of things
CSS	Cooperative spectrum sensing
DL	Deep learning
DNN	Deep neural network
DRL	Deep reinforcement learning
ED	Energy detection
FSK	Frequency shift keying
GM	Gaussian mixture model
GPU	Graphics processing unit
ICA	Independent component analysis
IoT	Internet of Things
KNN	K-nearest neighbor
LTE	Long term evolution
LR	Logistic regression
LSTM	Long short-term memory
MBPO	Model-based policy optimizer
MDP	Markov decision process
MIMO	Multiple input multiple output
MRC	Maximum ratio combining
MF	Matched filter
ML	Machine learning
MLP	Multi-layer perceptron
MSE	Mean square error
MADRL	Multi-agent deep reinforcement learning
NB	Naïve Bayes
NCSS	Non-cooperative spectrum sensing
PCA	Principal component analysis
PILCO	Probabilistic inference for learning control
PSO	Particle swarm optimization
PU	Primary user
ReLU	Rectified linear unit
RF	Random forest
RL	Reinforcement learning
RMSProp	Root mean square propagation
RNN	Recurrent neural network
SGD	Stochastic gradient descent
SNR	Signal to noise ratio
SS	Spectrum sensing
SARSA	State–action–reward–state–action
SU	Secondary user
SVM	Support vector machine

References

Colakovic, A.; Hadzialic, M. Internet of Things (IoT): A review of enabling technologies, challenges and open research issues. Comput. Netw. 2018, 144, 17–39. [Google Scholar] [CrossRef]
Muzaffar, M.M.; Sharqi, R. A review of spectrum sensing in modern cognitive radio networks. Telecommun. Syst. 2024, 85, 347–363. [Google Scholar] [CrossRef]
Yucek, T.; Arslan, H. A survey of spectrum sensing algorithms for cognitive radio applications. IEEE Commun. Surv. Tutor. 2009, 11, 116–130. [Google Scholar] [CrossRef]
Islam, H.; Das, S.; Bose, T.; Ali, T. Diode based reconfigurable microwave filter for cognitive radio applications: A review. IEEE Access 2020, 8, 185429–185444. [Google Scholar] [CrossRef]
Chae, K.; Kim, Y. DS2MA, A deep learning based spectrum sensing scheme for a multi-antenna receiver. IEEE Wirel. Commun. Lett. 2023, 12, 952–956. [Google Scholar] [CrossRef]
Hlapsi, N.M. Enhancing hybrid spectrum access in CR-IoT networks: Reducing sensing time in Low SNR environments. Mesopotamian J. Comput. Sci. 2023, 2023, 47–52. [Google Scholar] [CrossRef]
Patil, P.; Pawar, P.R.; Jain, P.P.; Manoranjan, K.V.; Pradhan, D. Enhanced spectrum sensing based on Cyclo-stationary Feature Detection (CFD) in cognitive radio network using Fixed & Dynamic Thresholds Levels. Saudi J. Eng. Technol. 2020, 5, 271–277. [Google Scholar] [CrossRef]
Kumar, A.; Venkatesh, J.; Gaur, N.; Alsharif, M.H.; Uthansakul, P.; Uthansakul, M. Cyclostationary and energy detection spectrum sensing beyond 5G waveforms. Electron. Res. Arch. 2023, 31, 3400–3416. [Google Scholar] [CrossRef]
Kumar, A.; Venkatesh, J.; Gaur, N.; Alsharif, M.H.; Jahid, A.; Raju, K. Analysis of hybrid spectrum sensing for 5G and 6G waveforms. Electronics 2022, 12, 138. [Google Scholar] [CrossRef]
Solanki, S.; Dehalwar, V.; Choudhary, J. Deep learning for spectrum sensing in cognitive radio. Symmetry 2021, 13, 147. [Google Scholar] [CrossRef]
Gao, J.; Yi, X.; Zhong, C.; Chen, X.; Zhang, Z. Deep learning for spectrum sensing. IEEE Wirel. Commun. Lett. 2019, 8, 1727–1730. [Google Scholar] [CrossRef]
Bkassiny, M.; Li, Y.; Jayaweera, S.K. A survey on machine-learning techniques in cognitive radios. IEEE Commun. Surv. Tutor. 2012, 15, 1136–1159. [Google Scholar] [CrossRef]
Lee, W.; Kim, M.; Cho, D. Deep cooperative sensing: Cooperative spectrum sensing based on convolutional neural networks. IEEE Trans. Veh. Technol. 2019, 68, 3005–3009. [Google Scholar] [CrossRef]
Liu, C.; Wang, J.; Liu, X.; Liang, Y.-C. Deep CM-CNN for spectrum sensing in cognitive radio. IEEE J. Sel. Areas Commun. 2019, 37, 2306–2321. [Google Scholar] [CrossRef]
Soni, B.; Patel, D.K.; López-Benítez, M. Long short-term memory based spectrum sensing scheme for cognitive radio using primary activity statistics. IEEE Access 2020, 8, 97437–97451. [Google Scholar] [CrossRef]
Sarikhani, R.; Keynia, F. Cooperative spectrum sensing meets machine learning: Deep reinforcement learning approach. IEEE Commun. Lett. 2020, 24, 1459–1462. [Google Scholar] [CrossRef]
Pati, B.M.; Kaneko, M.; Taparugssanagorn, A. A deep convolutional neural network based transfer learning method for non-cooperative spectrum sensing. IEEE Access 2020, 8, 164529–164545. [Google Scholar] [CrossRef]
Zeng, Y.; Liang, Y.-C. Spectrum-sensing algorithms for cognitive radio based on statistical covariances. IEEE Trans. Veh. Technol. 2009, 58, 1804–1815. [Google Scholar] [CrossRef]
Arshad, K.; Imran, M.A.; Moessner, K. Collaborative spectrum sensing optimization algorithms for Cognitive Radio Networks. Int. J. Digit. Multimed. Broadcast. 2010, 2010, 1–20. [Google Scholar] [CrossRef]
Mahapatra, R.; Krusheel, M. Cyclostationary detection for cognitive radio with multiple receivers. IEEE ISWCS 2008, 493–497. [Google Scholar]
Guimaraes, D.A. Pietra-Ricci index detector for centralized data fusion cooperative spectrum sensing. IEEE Trans. Veh. Technol. 2020, 69, 12354–12358. [Google Scholar] [CrossRef]
Yawada, P.S.; Dong, M.T. Performance analysis of new spectrum sensing scheme using multi antennas with multiuser diversity in cognitive radio networks. Wirel. Commun. Mob. Comput. 2018, 2018, 8560278. [Google Scholar] [CrossRef]
Jaronde, P.; Vyas, A.; Gaikwal, M. Spectrum efficient cognitive radio sensor network for IoT with low energy consumption. Int. J. Recent Innov. Trends Comput. Commun. 2023, 11, 469–479. [Google Scholar] [CrossRef]
Guo, H.; Jiang, W.; Luo, W. Linear soft combination for cooperative spectrum sensing in cognitive radio networks. IEEE Commun. Lett. 2017, 21, 1573–1576. [Google Scholar] [CrossRef]
Do, N.T.; An, B. A soft-hard combination- based cooperative spectrum sensing scheme for cognitive radio networks. Sensors 2015, 15, 4388–4407. [Google Scholar] [CrossRef]
Nasser, A.; Hassan, H.A.; Chaaya, J.A.; Mansour, A.; Yao, K.-C. Spectrum sensing for cognitive radio: Recent advances and future challenges. Sensors 2021, 21, 2408. [Google Scholar] [CrossRef]
Jerry, R.; Adekogba, O.O.; Maxwell, F.; Usman, A.D. A review of spectrum sensing times in cognitive radio networks. Adv. Engr. Des. Technol. 2023, 5, 29–49. [Google Scholar]
Atzori, L.; Iera, A.; Morabito, G. The Internet of Things: A survey. Comput. Netw. 2010, 54, 2787–2805. [Google Scholar] [CrossRef]
Raji, A.A.; Orimolade, J.F.; Ewetola, I.A. Design and implementation of internet of things based scheme for testing loamy soil. Turk. J. Eng. 2025, 9, 323–333. [Google Scholar] [CrossRef]
Bhuiyan, M.N.; Rahman, M.M.; Billah, M.M.; Saha, D. Internet of Things (IoT): A review of its enabling technologies in healthcare applications, standards protocols, security, and market opportunities. IEEE Internet Things J. 2021, 8, 10474–10498. [Google Scholar] [CrossRef]
Yang, H.; Zhong, W.-D.; Chen, C.; Alphones, A.; Xie, X. Deep-reinforcement-learning-based energy-efficient resource management for social and cognitive internet of things. IEEE Internet Things J. 2020, 7, 5677–5689. [Google Scholar] [CrossRef]
Perera, C.; Liu, C.H.; Jayawardena, S. The emerging Internet of Things marketplace from an industrial perspective: A survey. IEEE Trans. Emerg. Top. Comput. 2015, 3, 585–598. [Google Scholar] [CrossRef]
Al-Fuqaha, A.; Guizani, M.; Mohammadi, M.; Aledhari, M.; Ayyash, M. Internet of Things: A survey on enabling technologies, protocols and applications. IEEE Commun. Surv. Tutor. 2015, 17, 2347–2376. [Google Scholar] [CrossRef]
Shaikh, F.K.; Zeadally, S.; Exposito, E. Enabling technologies for green Internet of Things. IEEE Syst. J. 2017, 11, 983–994. [Google Scholar] [CrossRef]
Ahan, M.S.; Pathan, A.-S.K. A comprehensive survey on the requirements, applications, and future challenges for access control models in IoT: The state of the art. IoT 2025, 6, 9. [Google Scholar] [CrossRef]
Zhu, C.; Leung, V.C.M.; Shu, L.; Ngai, E.D.C. Green Internet of Things for Smart World. IEEE Access 2015, 3, 2151–2162. [Google Scholar] [CrossRef]
Raji, A.A.; Orimolade, J.F.; Adejumobi, I.A.; Amusa, K.A.; Olajuwon, B.I. Channel estimation via compressed sampling matching pursuit for hybrid MIMO architectures in millimeter wave communication. Int. J. Electron. Lett. 2025, 13, 56–70. [Google Scholar] [CrossRef]
Al-Turjman, F.M. Information-centric sensor networks for cognitive IoT: An overview. Ann. Telecommun. 2016, 72, 3–18. [Google Scholar] [CrossRef]
Miah, M.S.; Schukat, M.; Barrett, E. A throughput analysis of an energy-efficient spectrum sensing scheme for the cognitive radio based internet of things. EURASIP J. Wirel. Commun. Netw. 2021, 201, 1–36. [Google Scholar] [CrossRef]
Liu, X.; Li, Y.; Zhang, X.; Lu, W.; Xiong, M. Energy efficient resource optimization in green cognitive internet of things. Mob. Netw. Appl. 2020, 25, 2527–2535. [Google Scholar] [CrossRef]
Wu, Q.; Ding, G.; Xu, Y.; Feng, S.; Du, Z.; Wang, J.; Long, K. Cognitive Internet of Things: A new paradigm beyond connection. IEEE Internet Things J. 2014, 1, 129–143. [Google Scholar] [CrossRef]
Garhwal, A.; Bhattacharya, P.P. A survey on dynamic spectrum access technologies for cognitive radio. Int. J. Next-Gener. Netws. 2012, 3, 15–32. [Google Scholar] [CrossRef]
Saad, M.A.; Mustafa, S.T.; Ali, M.H.; Hashim, M.M.; Bin Ismail, M.; Ali, A.H. Spectrum sensing and energy detection in cognitive networks. Indones. J. Electr. Eng. Comput. Sci. 2020, 17, 465–472. [Google Scholar] [CrossRef]
Geng, X.; Hu, B. Maritime spectrum sensing based on cyclostationary features and convolutional neural networks. Entropy 2025, 27, 809. [Google Scholar] [CrossRef]
Damodaram, D.; Venkateshwarlu, T. Performance analysis of cyclostationary spectrum sensing in cognitive radio. Int. J. Appl. Eng. Res. 2020, 10, 43603–43610. [Google Scholar]
Zhang, X.; Chai, R.; Gao, F. Matched filter based spectrum sensing and power detection for cognitive radio network. In Proceedings of the 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Atlanta, GA, USA, 3–5 December 2014; pp. 1267–1270. [Google Scholar]
Salahdine, F.; ElGhazi, H.; Kaabouch, N.; Fihri, W.F. Matched filter detection with dynamic threshold for cognitive radio networks. In Proceedings of the 2015 International Conference on Wireless Networks and Mobile Communications (WINCOM), Marrakesh, Morocco, 20–23 October 2015. [Google Scholar]
Geete, P.; Gupta, M.K. Matched filter spectrum sensing technique for various fading channels of cognitive radio network. J. Emerg. Technol. Innov. Res. (JETIR) 2019, 6, 551–559. [Google Scholar]
Kalhoro, S.; Umrani, F.A.; Khanzada, M.A.; Ali Rahoo, L. Matched filter based spectrum sensing technique for 4G cellular network. Mehran Univ. Res. J. Eng. Technol. 2019, 38, 973–978. [Google Scholar] [CrossRef]
Maza, D.; Ojo, J.O.; Akinlade, G.O. A predictive machine learning framework for diabetes. Turk. J. Eng. 2024, 8, 583–592. [Google Scholar] [CrossRef]
Bhardwaj, P.; Gupta, P.K.; Panwar, H.; Siddiqui, M.K.; Morales-Menendez, R.; Bhaik, A. Applications of deep learning on student engagement in e-learning environment. Comput. Electr. Eng. 2021, 93, 107277. [Google Scholar] [CrossRef]
Hastings, P.; Hughes, S.; Britt, M.A. Active learning for improving machine learning of students. Int. Conf. Artif. Intell. Educ. 2018, 10947, 140–153. [Google Scholar]
Siddiqui, M.K.; Morales-Menendez, R.; Gupta, P.K.; Iqbai, H.; Hussain, F.; Khatoon, K.; Ahmad, S. Correlation between temperature and CoVID-19 (suspected, confirmed and death) cases based on machine learning analysis. J. Pure Appl. Microbiol. 2020, 14, 1017–1024. [Google Scholar] [CrossRef]
Albahri, A.S.; Khaleel, Y.L.; Habeeb, M.A.; Ismael, R.D.; Hameed, Q.A.; Deveci, M.; Homo, R.Z.; Alhbahri, O.S.; Alamoodi, A.H.; Alzubaidi, L. A systematic review of trustworthy artificial intelligence applications in natural disasters. Comput. Electr. Eng. 2024, 118, 1–53. [Google Scholar] [CrossRef]
Wang, H.; Barone, G.; Smith, A. Current and future role of data fusion and machine learning in infrastructural health monitoring. Struct. Infrastruct. Eng. 2023, 20, 1853–1882. [Google Scholar] [CrossRef]
De vries, A.; Blinznyuk, N.; Pinedo, P. Invited review: Examples and opportunities for artificial intelligence (AI) in dairy farms. Appl. Anim. Sci. 2023, 3, 14–22. [Google Scholar] [CrossRef]
Hyder, U.; Talpur, M.-R.-H. Detection of cotton leaf disease with machine learning model. Turk. J. Eng. 2024, 8, 380–393. [Google Scholar] [CrossRef]
Sinap, V. Comparative analysis of machine learning techniques for credit card fraud detection: Dealing with imbalanced datasets. Turk. J. Eng. 2024, 8, 196–208. [Google Scholar] [CrossRef]
Praveen Kumar, D.; Amgoth, T.; Annavarapu, C.S.R. Machine learning algorithms for wireless sensor networks: A survey. Inf. Fusion 2019, 49, 1–25. [Google Scholar] [CrossRef]
Sun, Y.; Peng, M.; Zhou, Y.; Huang, Y.; Mao, S. Application of machine learning in wireless networks: Key techniques and open issues. IEEE Commun. Surv. Tutor. 2019, 21, 3072–3108. [Google Scholar] [CrossRef]
Kaur, J.; Khan, M.A.; Iftikhar, M.; Imran, M.; Haq, Q.E.U. Machine learning techniques for 5G and beyond. IEEE Access 2021, 9, 23472–23488. [Google Scholar] [CrossRef]
Wang, J.; Jiang, C.; Zhang, H.; Ren, Y.; Chen, K.-C.; Hanzo, L. Thirty years of machine learning: The road to Pareto-optimal wireless networks. IEEE Commun. Surv. Tutor. 2020, 22, 1472–1514. [Google Scholar] [CrossRef]
Alamu, O.; Olwal, T.O.; Migabo, M. Machine learning applications in energy harvesting Internet of Things Networks: A review. IEEE Access 2025, 13, 4235–4266. [Google Scholar] [CrossRef]
Yazici, İ.; Shayea, I.; Din, J. A survey of applications of artificial intelligence and machine learning in future mobile networks-enabled systems. Eng. Sci. Technol. Int. J. 2023, 44, 101455. [Google Scholar] [CrossRef]
Yang, X.; Song, Z.; King, I.; Xu, Z. A survey on deep semi-supervised learning. IEEE Trans. Knowl. Data Eng. 2022, 35, 8934–8954. [Google Scholar] [CrossRef]
Arjoune, Y.; Kaabouch, N. On spectrum sensing, a machine learning method for cognitive radio systems. In Proceedings of the 2019 IEEE International Conference on Electro Information Technology (EIT), Brookings, SD, USA, 20–22 May 2019; pp. 333–338. [Google Scholar]
Kavya, R.K.; Talmilsevi, T. Machine learning techniques in spectrum sensing. Int. J. Sci. Res. Sci. Eng. Technol. 2023, 10, 739–746. [Google Scholar]
Eren, E.; Censur, I. Comparative analysis of machine learning models for CO emission prediction in engine performance. Sak. Univ. J. Comput. Inf. Sci. 2025, 9, 1–11. [Google Scholar] [CrossRef]
Kim, T.; Vecchietti, L.F.; Choi, K.; Lee, S.; Har, D. Machine learning for advanced wireless sensor networks: A review. IEEE Sens. J. 2021, 21, 12379–12397. [Google Scholar] [CrossRef]
Kumar, V.; Kandpal, D.C.; Jain, M.; Gangopadhyay, R.; Debnath, S. K-mean clustering based cooperative spectrum sensing in generalized κ-μ fading channels. In Proceedings of the 2016 Twenty Second National Conference on Communication, Guwahati, India, 4–6 March 2016; pp. 1–5. [Google Scholar]
Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; Smith, R.R., Ed.; MIT Press: Peterborough, NH, USA, 2018. [Google Scholar]
Bouneffouf, D.; Rish, I.; Aggarwal, C. Survey on Applications of Multi-Armed and Contextual Bandits. In Proceedings of the 2020 IEEE Congress on Evolutionary Computation (CEC), Glasgow, UK, 19–24 July 2020; pp. 1–8. [Google Scholar]
Li, F.; Yu, D.; Yang, H.; Yu, J.; Karl, H.; Cheng, X. Multi-Armed- Bandit-Based spectrum scheduling algorithms in wireless networks: A survey. IEEE Wirel. Commun. 2020, 27, 24–30. [Google Scholar] [CrossRef]
Maghsudi, S.; Hossain, E. Multi-armed bandits with application to 5G small cells. IEEE Wirel. Commun. 2016, 23, 64–73. [Google Scholar] [CrossRef]
Barrachina-Muñoz, S.; Chiumento, A.; Bellalta, B. Multi-armed bandits for spectrum allocation in multi-agent channel bonding WLANs. IEEE Access 2021, 9, 133472–133490. [Google Scholar] [CrossRef]
Puterman, M.L. Markov Decision Processes: Discrete Stochastic Dynamic Programming; Wiley: Hoboken, NJ, USA, 2014. [Google Scholar]
Sharma, N.; Mastronarde, N.; Chakareski, J. Accelerated structure-aware reinforcement learning for delay-sensitive energy harvesting wireless sensors. IEEE Trans. Signal Process. 2020, 68, 1409–1424. [Google Scholar] [CrossRef]
Wu, K.; Jiang, H.; Tellambura, C. Sensing, probing, and transmitting policy for energy harvesting cognitive radio with two-stage after-state reinforcement learning. IEEE Trans. Veh. Technol. 2019, 68, 1616–1630. [Google Scholar] [CrossRef]
Seijen, H.V.; Mahmood, A.R.; Pilarski, P.M.; Machado, M.C.; Sutton, R.S. True online temporal-difference learning. J. Mach. Learn. Res. 2016, 17, 5057–5096. [Google Scholar]
Brunton, S.L.; Kutz, J.N. Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control; Cambridge University Press: Cambridge, UK, 2022. [Google Scholar]
Mu, X.; Zhao, X.; Liang, H. Power allocation based on reinforcement learning for MIMO system with energy harvesting. IEEE Trans. Veh. Technol. 2020, 69, 7622–7633. [Google Scholar] [CrossRef]
Jiang, H.; He, H.; Liu, L.; Yi, Y. Q-learning for non-cooperative channel access game of cognitive radio networks. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018. [Google Scholar]
Mammeri, Z. Reinforcement learning based routing in networks: Review and classification of approaches. IEEE Access 2019, 7, 55916–55950. [Google Scholar] [CrossRef]
Abu Alsheikh, M.; Hoang, D.T.; Niyato, D.; Tan, H.-P.; Lin, S. Markov decision processes with applications in wireless sensor networks: A survey. IEEE Commun. Surv. Tutor. 2015, 17, 1239–1267. [Google Scholar] [CrossRef]
Usman, A.U.; Okereke, O.U.; Omizegba, E.E. Macrocell pathloss prediction using artificial intelligence techniques. Int. J. Electr. 2013, 101, 500–515. [Google Scholar] [CrossRef]
Meireles, M.R.G.; Almeida, P.E.M.; Simoes, M.G. A comprehensive review for industrial applicability of artificial neural networks. IEEE Trans. Ind. Electron. 2003, 50, 585–601. [Google Scholar] [CrossRef]
Bottou, L.; Cortes, C.; Denker, J.S.; Drucker, H.; Guyon, I.; Jackel, L.D.; LeCun, Y.; Muller, U.; Sackinger, E.; Simard, P.; et al. Comparison of classifier methods: A case study in handwritten digit recognition. In Proceedings of the 12th IAPR International Conference on Pattern Recognition, Jerusalem, Israel, 9–13 October 1994; Volume 77–82, p. 3. [Google Scholar]
Lv, M.; Zhou, G.; He, M.; Chen, A.; Zhang, W.; Hu, Y. Maize Leaf Disease Identification Based on Feature Enhancement and DMS-Robust Alexnet. IEEE Access 2020, 8, 57952–57966. [Google Scholar] [CrossRef]
Alippi, C.; Disabato, S.; Roveri, M. Moving Convolutional Neural Networks to Embedded Systems: The AlexNet and VGG-16 Case. In Proceedings of the 2018 17th ACM/IEEE Conference on Information Processing in Sensor Networks, Porto, Portugal, 11–13 April 2018; pp. 212–223. [Google Scholar]
Alom, M.Z.; Taha, T.M.; Yakopcic, C.; Westberg, S.; Sidike, P.; Nasrin, M.S.; Hasan, M.; Van Essen, B.C.; Awwal, A.A.S.; Asari, V.K. A state-of-the-art survey on deep learning theory and architectures. Electronics 2019, 8, 292. [Google Scholar] [CrossRef]
Jordan, M.I. Serial order: A parallel distributed processing approach. Adv. Psychol. Elsevier 1997, 121, 471–495. [Google Scholar]
Balwani, N.; Patel, D.K.; Soni, B.; Lopez-Benıtez, M. Long short-term memory based spectrum sensing scheme for cognitive radio. In Proceedings of the IEEE 30th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Istanbul, Turkey, 8–11 September 2019. [Google Scholar]
Xie, J.; Fang, J.; Liu, C.; Yang, L. Unsupervised deep spectrum sensing: A variational Auto-Encoder based approach. IEEE Trans. Veh. Technol. 2020, 69, 5307–5329. [Google Scholar] [CrossRef]
Cheng, Q.; Shi, Z.; Nguyen, D.N.; Dutkiewicz, E. Sensing OFDM signal: A deep learning approach. IEEE Trans. Commun. 2019, 67, 7785–7798. [Google Scholar] [CrossRef]
Raj, V.; Dias, I.; Tholeti, T.; Kalyani, S. Spectrum access in cognitive radio using a two-stage reinforcement learning approach. IEEE J. Sel. Top. Sig. Process. 2018, 12, 20–34. [Google Scholar] [CrossRef]
Vakili, S.; Liu, K.; Zhao, Q. Deterministic sequencing of exploration and exploitation for multi-armed bandit problems. IEEE J. Sel. Top. Sig. Process. 2013, 7, 759–767. [Google Scholar] [CrossRef]
Zhu, J.; Song, Y.; Jiang, D.; Song, H. A new deep-Q-learning-based transmission scheduling mechanism for the cognitive Internet of things. IEEE Internet Things J. 2018, 5, 2375–2385. [Google Scholar] [CrossRef]
Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533.
Nguyen, T.T.; Reddi, V.J. Deep reinforcement learning for cyber security. IEEE Trans. Neural Netw. Learn. Syst. 2021, 34, 3779–3795.
Zhang, Z.; Zhang, D.; Qiu, R.C. Deep reinforcement learning for power system applications: An overview. CSEE J. Power Energy Syst. 2019, 6, 213–225.
Ning, Z.; Xie, L. A survey on multi-agent reinforcement learning and its applications. J. Autom. Intel. 2024, 3, 73–91.
Canese, L.; Cardarilli, G.C.; Di Nunzio, L.; Fazzolari, R.; Giardino, D.; Re, M.; Spanò, S. Multi-Agent Reinforcement Learning: A Review of Challenges and Applications. Appl. Sci. 2021, 11, 4948.
Tavares, C.H.A.; Marinello, J.C.; Proenca, M.L.; Abaw, T. Machine learning-based models for spectrum sensing in cooperative radio networks. IET Commun. 2020, 14, 3102–3109.
Solanki, S.; Dehalwar, V.; Choudhary, J.; Kolhe, M.L.; Ogura, K. Spectrum sensing in cognitive radio using CNN-RNN and transfer learning. IEEE Access 2022, 10, 113482–113492.
Kumar, A.; Gaur, N.; Chakravarty, S.; Alsharif, M.H.; Uthansakul, P.; Uthansakul, M. Analysis of spectrum sensing using deep learning algorithms: CNNs and RNNs. Ain Shams Eng. J. 2024, 15, 102505.
Ajayi, O.O.; Badrudeen, A.A.; Oyedeji, A.I. Deep learning based spectrum sensing technique for smarter cognitive radio networks. J. Inven. Eng. Technol. 2021, 1, 64–77.
Mishra, Y.; Chanudhary, V.S. Deep learning approach for cooperative sensing under congested cognitive IoT network. J. Integr. Sci. Technol. 2024, 12, 1–8.
Patel, D.K.; Lopez-Benitez, M.; Soni, B.; Garcia-Fernandez, A.-F. Artificial neural network design for improved spectrum sensing in cognitive radio. Wirel. Netw. 2020, 26, 6155–6174.
Samala, S.; Mishra, S.; Singh, S.S. Machine learning based cooperative spectrum sensing in a generalized α-κ-β fading channels. J. Sci. Ind. Res. 2023, 82, 219–225.
Olatunji, S.A.; Fajemilehin, T.O.; Opadiji, J.F. Reduction of computational time for cooperative sensing using reinforcement learning algorithm. Afr. J. Comput. ICT 2019, 12, 90–108.
Raghavendra, L.R.; Manjunatha, R.C. Optimizing spectrum sensing in cognitive radio using Bayesian-optimized random forest. Int. J. Intell. Eng. Syst. 2023, 16, 505–518.
Gao, A.; Du, C.; Ng, S.X.; Liang, W. A cooperative spectrum sensing with multi-agent reinforcement learning approach in cognitive radio networks. IEEE Commun. Lett. 2021, 25, 2604–2608.
Tan, T.; Jing, X. Cooperative spectrum sensing based on Convolutional Neural Networks. Appl. Sci. 2021, 11, 4440.
Kumar, A.; Gaur, N.; Nanthaamornphong, A. Hybrid spectrum using Neural network-based MF and ED for enhanced detection in Rayleigh channel. J. Electr. Comput. Eng. 2025, 2025, 9506922.
Prasad, K.V.V.; Rao, P.T. Learning based cooperative spectrum sensing for primary user detection in cognitive radio networks. ICTACT J. Commun. Technol. 2020, 11, 3.
Li, Z.; Wu, W.; Liu, X.; Qi, P. Improved cooperative spectrum sensing model based on machine learning for cognitive radio networks. IET Commun. 2018, 12, 2485–2492.
Ning, W.; Huang, X.; Yang, K.; Wu, F.; Leng, S. Reinforcement learning enabled cooperative spectrum sensing in cognitive radio networks. J. Commun. Netw. 2020, 22, 1.
Saber, M.; ElRharras, A.; Saadane, R.; Chehri, A.; Hakem, N.; Kharraz, H.A. Spectrum sensing for smart embedded devices in cognitive networks using machine learning algorithms. Procedia Comput. Sci. 2020, 176, 2404–2413.
Zhang, Y.; Luo, Z. A deep-learning based method for spectrum sensing with multiple feature combination. Electronics 2024, 13, 2795.
Subekti, A.; Pardede, H.F.; Sustika, R.; Suyoto. Spectrum sensing for cognitive radio using deep Autoencoder neural network. In Proceedings of the 2018 International Conference on Radar, Antenna, Microwave, Electronics, and Telecommunications (ICRAMET), Serpong, Indonesia, 1–2 November 2018; pp. 81–85.
Subray, S.; Tsschmben, S.; Gifford, K. Towards enhancing spectrum sensing: Signal classification using Autoencoder. IEEE Access 2021, 9, 82288–82299.
Li, Y.; Song, H.; Ren, X.; Zhang, Z.; Cheng, S.; Jing, X. Spectrum Sensing Meets ISAC: An Spectrum Detection Scheme for ISAC Services Based on Improved Denoising Auto-Encoder and CNN. Appl. Sci. 2025, 15, 3381.
Mourougayare, K.; Amgothu, B.; Bhagat, S.; Srikanth, S. A robust multistage spectrum sensing model for cognitive radio applications. AEU Int. J. Electron. Commun. 2019, 110, 152876.
Kockaya, K.; Develi, I. Spectrum sensing in cognitive radio network: Threshold optimization and analysis. EURASIP J. Wirel. Commun. Netw. 2020, 255, 1–20.
Fernando, X.; Lazaroui, G. Spectrum sensing, clustering algorithms, and energy harvesting technology for cognitive-radio based Internet-of-Things network. Sensors 2023, 23, 7792.
Song, Z.; Wang, X.; Liu, Y.; Zhang, Z. Joint spectrum resource allocation in NOMA-based cognitive radio network with SWIPT. IEEE Access 2019, 7, 89594–89603.
Obite, I.; Usman, A.D.; Okafor, E. An overview of deep reinforcement learning for spectrum sensing in cognitive radio networks. Dig. Sign. Process. 2021, 113, 103014.
Teodara, S.; George, D.-M.; Sherali, Z.; Silviu, F. Energy harvesting techniques for internet of things (IoT). IEEE Access 2021, 9, 39530–39549.
Fakharian, M.M. A high gain wideband circularly polarized rectenna with wide ranges of input power and output power. Int. J. Electron. 2021, 109, 83–99.
Erinosho, T.C.; Adekola, S.A.; Amusa, K.A. Design of practical rectennas for RF energy harvesting. In Proceedings of the 2019 Photonics & Electromagnetics Research Symposium (PIERS), Moscow, Russia, 17–20 June 2019; pp. 1149–1156.
Manesh, M.R.; Kaabouch, N. Security threats and countermeasures of MAC layer in cognitive radio networks. Ad Hoc Netw. 2018, 70, 85–102.
Fihri, W.F.; Arjoune, Y.; El Ghazi, H.; Kaabouch, N.; Abou El Majd, B. A particle swarm optimization based algorithm for primary user emulation attack detection. In Proceedings of the 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 8–10 January 2018; pp. 823–827.
Arjoune, Y.; Mrabet, Z.E.; Kaabouch, N. Multi-Attributes, Utility-Based, Channel Quality Ranking Mechanism for Cognitive Radio Networks. Appl. Sci. 2018, 8, 628.
Alican, D.; Cemal, Y. Enhancing apple plant leaf disease detection performance with transfer learning methods. Sak. Univ. J. Comput. Inf. Sci. 2025, 9, 592–605.

Figure 1. Smart world supported by 5G and beyond internet access.

Figure 2. Forms of SS in CRIoT Networks.

Figure 3. Centralized CSS model.

Figure 4. CFD design for SS in the CRIoT network.

Figure 5. MF design for SS.

Figure 6. Classes of ML.

Figure 7. MDP model for SS [82].

Figure 8. Basic ANN model.

Figure 9. MLP architecture for SS in CRIoT networks.

Figure 10. Autoencoder architecture for SS.

Figure 11. Proposed MADRL architecture for SS in 5G and beyond IoT networks.

Figure 12. Distribution of the utilization of ML models for SS in the years covering 2018 to 2025.

Table 1. Summary of research/review articles on SS in the CRIoT network.

Reference	Type of the Article	SS Methods Adopted	Contribution/Findings	Remarks/Limitations
[18]	Research	Statistical method	The authors introduce a statistical covariance technique for SS, which is shown to outperform ED and CFD schemes.	The performance of the proposed technique degrades at low threshold values.
[19]	Research	Genetic algorithm	A genetic algorithm is proposed to optimize the sensing time and to reduce the probability of missed detection.	It requires knowledge of the PU and the channel condition, which may not be available prior to the design of the IoT network.
[20]	Research	CFD	The authors show that the probability of detecting the PU in a radio network is high if MIMO antennas are used.	The work fails to account for imperfections in the wireless channel and does not demonstrate the application of the proposed technique in a real-world scenario.
[21]	Research	Pietra–Ricci index detection	A novel Pietra–Ricci index detector (statistical model) is utilized to detect the presence of the PU. The proposed technique is shown to exhibit lower computational complexity than CFD.	A goodness-of-fit test, which is critical for evaluating the performance of a statistical model, is not considered by the authors.
[22]	Research	Multi-user antenna design system	The authors propose a multi-user antenna system to detect the presence of the PU in a noisy region, where ED and CFD consider such conditions as signifying the absence of the PU signal.	The proposed technique falters in high-noise regimes of the wireless channel.
[23]	Research	PSO	The authors use PSO to show that the detection threshold, swarm size, and number of iterations affect spectrum detection and the spectral efficiency of the wireless network.	The authors fail to validate the proposed technique by comparing its performance with state-of-the-art techniques.
[24]	Research	Linear combination scheme	Proposes a linear combination scheme for SS in slow, block, and fast-fading environments. The superiority of the proposed model over conventional linear combination schemes is demonstrated.	The mobility of the wireless channel, which is typical of real-life applications, is overlooked by the authors for ease of analysis.
[25]	Research	Soft combination scheme	The authors propose a soft combination scheme to reduce the time required for detecting the vacant spectrum in noisy conditions of the wireless channel. The superiority of the proposed technique over ED is demonstrated.	The performance of the proposed scheme degrades at high SNR values.
[26]	Review	__	Focuses the discussion on half- and full-duplex CR as well as SS in a wireless sensor network. CR applications in 5G and beyond networks are discussed. Furthermore, the latter part of the article discusses challenges that need to be addressed in SS.	Discussion of an ML-based solution to SS problems in the CRIoT network is not presented.
[27]	Review	ED, CFD, MF, and waveform detection are discussed	Discussion is centered on the utilization of ED, CFD, MF, and waveform detection methods for reducing sensing time in CR networks.	An ML-based solution to SS problems is not discussed.
This paper	Review	MF, CFD, Pietra–Ricci detection scheme and ML algorithms are discussed	Provides an overview of SS techniques employed in CRIoT networks. The techniques considered are MF, CFD, the Pietra–Ricci detection scheme, and ML algorithms. Proposes MADRL for SS in CRIoT networks. Furthermore, an overview of the studies on the application of ML algorithms is presented. In addition, challenges in implementation and open areas for further exploration are highlighted and discussed.

Table 2. Comparison of the strengths and weaknesses of traditional and ML-based SS techniques.

SS Techniques	Strength	Weakness
ED	Implementation is easy and no prior information about the PU is required.	Exhibits poor performance at low SNR.
CFD	Robust to noise, fading, and shadowing.	Computationally intensive and takes longer to sense vacant spectrum.
MF	Demonstrates robust performance at low SNR.	Requires information about the PU, which may not be available prior to the design of the IoT network.
Pietra–Ricci Index detector	Prior information about the PU is not required.	Performance depends on the value of the threshold. A poorly chosen threshold gives a misleading indication of the PU's state.
Supervised Learning and Unsupervised Learning	Learns features of the PU from a given dataset without requiring either partial or full knowledge of the PU.	Large datasets are required, and the accuracy of spectrum detection depends on the selected features.
Deep Learning (CNN, MLP, and LSTM)	Demonstrates good performance at low SNR.	Large datasets are needed. Computational complexity is high because of the many hidden layers involved in training and the way the data are processed.
Reinforcement Learning	Little training data is required for implementation. Performs resiliently at low SNR and requires less time to sense vacant spectrum.	High computing power is needed for implementation.
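
The ED row of Table 2 can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed works and rests on several assumptions: the PU signal and the noise are real-valued, zero-mean Gaussian samples, the noise variance is known, and the threshold is set from a Gaussian approximation of the test statistic for a target $P_{FA}$. The function names (`ed_threshold`, `energy_statistic`, `detection_rate`) and all parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist


def ed_threshold(noise_var, n_samples, p_fa):
    """Threshold for a target false-alarm rate (Gaussian approximation, assumed)."""
    q_inv = NormalDist().inv_cdf(1.0 - p_fa)
    return noise_var * (1.0 + q_inv * math.sqrt(2.0 / n_samples))


def energy_statistic(samples):
    """Average energy of the received samples."""
    return sum(x * x for x in samples) / len(samples)


def detection_rate(snr_db, n_samples=200, p_fa=0.01, trials=300,
                   noise_var=1.0, pu_active=True, rng=None):
    """Monte Carlo estimate of how often the detector declares 'PU present'.

    With pu_active=True this estimates P_D; with pu_active=False it
    estimates P_FA. No prior knowledge of the PU waveform is used.
    """
    rng = rng or random.Random(0)
    threshold = ed_threshold(noise_var, n_samples, p_fa)
    signal_std = math.sqrt(noise_var * 10 ** (snr_db / 10))
    noise_std = math.sqrt(noise_var)
    hits = 0
    for _ in range(trials):
        received = []
        for _ in range(n_samples):
            s = rng.gauss(0.0, signal_std) if pu_active else 0.0
            received.append(s + rng.gauss(0.0, noise_std))
        if energy_statistic(received) > threshold:
            hits += 1
    return hits / trials
```

Under these assumptions, `detection_rate(0.0)` should return a value close to 1, whereas `detection_rate(-15.0)` should return a value not far above the false-alarm rate, which reflects the low-SNR weakness listed for ED.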

Table 3. Summary of the contributions on the application of ML for SS in CRIoT networks.

Reference	ML Techniques	Class of the ML Techniques	Cognitive Radio Network	Main Findings	Limitations
[66]	RF, SVM, decision tree, Naïve Bayes, KNN, LR, and ANN	Supervised ML	Cooperative	The authors demonstrate that the performance of RF is superior to SVM, decision tree, Naïve Bayes, KNN, LR, and ANN.	All ML techniques considered as candidates for sensing exhibit high $P_{FA}$, which is undesirable in SS.
[103]	MLP, SVM, and NB	Supervised DL and supervised ML	Cooperative	MLP exhibits better performance in terms of training time and spectrum detection capability.	The models are not assessed under non-stationary conditions of the wireless channel, where the SU and PU are mobile. The research is limited to the case of a single PU and three SUs.
[104]	Hybrid CNN-RNN and transfer learning	Supervised DL	Cooperative	Achieves high $P_{D}$ and low $P_{FA}$.	The study incurs high computational complexity that may limit its adoption in large-scale networks.
[105]	RNN and CNN	Supervised DL	Cooperative	CNN and RNN outperform traditional techniques in terms of $P_{D}$, $P_{FA}$, and bit error rate.	The models require large datasets for training and are susceptible to overfitting.
[106]	LSTM	Supervised DL	Cooperative	Higher spectrum detection accuracy and robust performance in the low-SNR regime of the wireless channel are achieved.	Large training datasets are needed, and relatively high detection time is required for implementing the ML technique.
[107]	ResNet50	Supervised DL	Non-cooperative	The rate of detection of PUs in the CRIoT network improves with the reduction in the level of noise in the network.	Optimization of the model to reduce computational time at high SNR values is not reported.
[108]	ANN	Supervised ML	Non-cooperative	High detection rate at high SNR and robust performance at low SNR are demonstrated. The model is superior to ED and improved ED.	Single PU and single SU are considered for investigation but a practical wireless communication system has many PUs and SUs.
[109]	K-means clustering	Unsupervised ML	Cooperative	Achieves resilient detection performance in the noisy condition of the channel.	Requires large training datasets.
[110]	Q-learning	RL	Cooperative	Improves spectrum detection in CR network.	The impact of changes in the network environment on the performance of RL algorithm is not addressed.
[111]	RF	Supervised ML	Cooperative	RF exhibits better accuracy than SVM, KNN, GMM, and NB.	The work is limited to the case of a single PU and three SUs.
[112]	Actor–critic	RL	Cooperative	Reduces the communication overhead required for SS.	The authors do not evaluate the performance of the learning model in terms of $P_{D}$ and $P_{FA}$. The study overlooks real-world challenges and imperfections like small- and large-scale fading as well as noise interference, which may impede practical deployment.
[113]	CNN (AlexNet, LeNet, and VGG-16)	Supervised DL	Cooperative	Improves the accuracy of spectrum detection in the CRIoT network. The CNN models exhibit better sensing performance than traditional AND, OR, and voting-based SS schemes.	Large datasets are needed for implementation.
[114]	ANN	Supervised ML	Cooperative	Utilizes an ANN to improve the performance of ED and MF at low SNRs. Achieves better accuracy in spectrum detection. Furthermore, the proposed ANN+ED and ANN+MF reduce the false alarm rate and BER. It is shown that ANN+MF outperforms ANN+ED, ED, CFD, and SVM.	High computational complexity and large training datasets are needed. The proposed models are susceptible to overfitting and require extensive fine-tuning to suit a real network environment.
[115]	SVM, KNN, and RL	Supervised ML and RL	Cooperative	Demonstrates the effectiveness of RL for SS.	The study lacks detailed analysis of practical deployment and challenges in the real world.
[116]	SVM	Supervised ML	Cooperative	Reduces computational time and improves the accuracy of spectrum detection. Furthermore, the work theoretically demonstrates how to reduce potential harm of redundant and abnormal SUs in an IoT network.	The paper overlooks imperfections like fading and noise interference in the real world.
[117]	Q-learning	RL	Cooperative	Reduces computational resources and delay in accessing the vacant PU band.	Noise interference in real-time implementation is not accounted for in the study.
[118]	ANN, SVM, decision tree, and KNN	Supervised ML	Experimental	Detects spectral holes of the primary communication in real-time applications.	A huge number of real signals is needed for training the ML models. The study overlooks noise, shadowing, and multipath fading, which degrade performance in real-time applications.
[119]	Hybrid CNN-LSTM	Supervised DL	Cooperative	Improves the accuracy of spectrum detection in low SNR regimes of the wireless channel.	High computational complexity and dependence on large training datasets for implementation.
[120]	Deep Autoencoder	Unsupervised DL	__	Utilizes a deep autoencoder to learn the features of the PUs and determine whether they are active or not.	The research overlooks imperfections associated with the wireless channel. This may derail the adoption of the model for practical applications.
[121]	Deep, LSTM, and variational Autoencoders	Unsupervised DL	__	Demonstrates the superiority of LSTM and deep autoencoders in sensing and distinguishing LTE and Wi-Fi signals.	The research overlooks the impact of wireless channels on signal transmission. The effects of fading, shadowing, and noise are not considered, limiting the implementation of the study in the real world.
[122]	Denoising Autoencoder	Unsupervised DL	Cooperative	Proposes a denoising autoencoder for detecting the vacant spectrum in integrated sensing and communication (ISAC) networks.	The study overlooks the mobility or non-stationarity of the wireless channel, which is critical in real-life applications.
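
Entry [113] in Table 3 uses the traditional AND, OR, and voting-based schemes as baselines. For orientation, the hard-decision rules can be sketched as they might run at the fusion center of a centralized CSS model. The sketch assumes that each SU reports one binary decision and that SUs decide independently with the same per-SU probability `p`; the names `votes_needed`, `fuse`, and `cooperative_prob` are hypothetical and do not come from the cited work.

```python
from math import comb


def votes_needed(n_users, rule):
    """Minimum number of 'PU present' reports needed to declare the PU active."""
    if n_users < 1:
        raise ValueError("at least one SU report is required")
    if rule == "or":
        return 1
    if rule == "and":
        return n_users
    if rule == "majority":
        return n_users // 2 + 1
    raise ValueError(f"unknown rule: {rule}")


def fuse(decisions, rule="majority"):
    """Combine binary SU decisions (truthy = PU present) into a global decision."""
    decisions = list(decisions)
    votes = sum(1 for d in decisions if d)
    return votes >= votes_needed(len(decisions), rule)


def cooperative_prob(p, n_users, rule="majority"):
    """Global detection (or false-alarm) probability for i.i.d. SU decisions.

    Uses the k-out-of-N binomial sum; p is the per-SU P_D (or P_FA).
    """
    k = votes_needed(n_users, rule)
    return sum(comb(n_users, i) * p ** i * (1 - p) ** (n_users - i)
               for i in range(k, n_users + 1))
```

Because the same formula applies to both $P_{D}$ and $P_{FA}$, the OR rule raises both probabilities and the AND rule lowers both, with majority voting in between. This trade-off is one reason learned fusion is studied as an alternative.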

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Raji, A.A.; Olwal, T.O. Spectrum Sensing in Cognitive Radio Internet of Things: State-of-the-Art, Applications, Challenges, and Future Prospects. J. Sens. Actuator Netw. 2025, 14, 109. https://doi.org/10.3390/jsan14060109

