Article

A Lightweight Deep Learning Model for Automatic Modulation Classification Using Residual Learning and Squeeze–Excitation Blocks

Department of Computer Engineering, Chosun University, Gwangju 61452, Republic of Korea
* Authors to whom correspondence should be addressed.
Appl. Sci. 2023, 13(8), 5145; https://doi.org/10.3390/app13085145
Submission received: 27 February 2023 / Revised: 13 April 2023 / Accepted: 19 April 2023 / Published: 20 April 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

Automatic modulation classification (AMC) is a vital process in wireless communication systems that is fundamentally a classification problem. It is employed to automatically determine the modulation type of a received signal. Deep learning (DL) methods have gained popularity in addressing the problem of modulation classification, as they automatically learn features without needing technical expertise. However, their efficacy depends on the complexity of the algorithm, which can be characterized by the number of parameters. In this research, we present a deep learning algorithm for AMC, inspired by residual learning, which has remarkable accuracy and great representational ability. We also employed a squeeze-and-excitation network, which models the interconnections between channels and adaptively re-calibrates the channel-wise feature responses to improve performance. The proposed network was designed to meet the accuracy requirements with a reduced number of parameters for efficiency. The proposed model was evaluated on two benchmark datasets and compared with existing methods. The results show that the proposed model outperforms existing methods in terms of accuracy and has up to 72.5% fewer parameters than convolutional neural network designs.

1. Introduction

In wireless communication, the complexity of the environment and the signals is rapidly increasing. Automatic modulation classification (AMC) is a vital process in ad hoc networks such as cognitive radio (CR) and software-defined radio (SDR) [1]. In a standard communication environment, information is typically communicated between the transmitter and receiver using a modulation scheme known to both sides [2], whereas CR transmitters autonomously choose modulation schemes based on external context, and CR receivers must independently identify the signal's modulation pattern [3]. AMC assists CR receivers in identifying the type of modulation selected by the transmitter. In SDR, AMC is applied to respond quickly to diverse and evolving communication networks whilst avoiding protocol overhead. Current technology in a cognitive jamming scenario involves the automatic discovery of the modulation schemes utilized by both friendly and adversarial signals [1].
While military technology has always been a driving force behind the advancement of AMC, commercial applications such as interference detection and spectrum sensing are also widespread [4]. The development of the 5th generation of telecommunication networks (5G), which is predicted to result in a proliferation of end devices and congestion of the electromagnetic spectrum, has sparked renewed interest in AMC. Without knowing the system parameters, AMC is used to determine the transmitter's modulation configuration from the received signal, as shown in Figure 1.
Signal, noise, and channel models have a significant effect on the classification result; therefore, all of them are used to develop AMC techniques. When the expected signal or noise model does not fit the actual signal or noise, the corresponding classification model fails to perform adequately. A more sophisticated model may narrow the gap with the real scenario, but it introduces many unspecified parameters to estimate, leading to greater estimation errors and a computational complexity that cannot be overlooked. Furthermore, certain settings, such as molecular communication, may not admit tractable analytical models, severely decreasing the classification accuracy of the typical model-based classifier [5]. Their computational complexity has also restricted their applicability to a wider range of fields. Data-driven AMC methods have been designed to address these complexities [6].
To address the aforementioned complexities, we present a data-driven AMC technique using deep learning (DL) to classify the modulation of the signal. To this end, we compared the performance of various DL algorithms from preceding works with the proposed technique. The contributions of this paper are as follows:
  • We employed the property of a residual learning block possessing significant representational capability in order to acquire latent information from received signals repeatedly, enhancing classification accuracy.
  • Additionally, we utilized the squeeze-and-excitation network block that is customized for the AMC task. This takes full advantage of modeling channel interconnections and iteratively adjusts channel-wise characteristic responses to boost efficiency.
  • To improve the architecture, we stacked various types of layers, such as Conv2D, batch normalization, and global average pooling layers, which were used to extract the features.
  • Two datasets, namely RadioML 2016.04C and RadioML 2016.10A, were utilized from [7] to compare the effectiveness and generalization ability of AMC with diverse architecture configurations.
  • The datasets included eleven types of modulated signals, namely 8PSK, AM-DSB, AM-SSB, BPSK, CPFSK, GFSK, PAM4, QAM16, QAM64, QPSK, and WBFM. They were all utilized to train the network. The simulation results demonstrate that, even with this larger number of modulation types, the proposed model extracts features and classifies modulations more accurately than contemporary modulation classification techniques.
The remainder of the paper is organized as follows: The literature review is presented in Section 2, the proposed design is presented in Section 3, and Section 4 comprises the simulation results. Finally, Section 5 concludes the paper.

2. Literature Review

2.1. Likelihood-Based (LB) Method

AMC is treated as a hypothesis-testing problem in the LB method. Algorithms based on the LB method can be efficient from a Bayesian perspective and are beneficial for reducing the likelihood of hypothesis errors. However, their high computational complexity often hampers accurate decisions, which can be difficult to obtain in actual systems. The LB method can reduce the probability of misclassification and can obtain the best classification accuracy, as such methods maximize the chance of correct classification under perfect channel conditions. In real-world scenarios, however, uncertainty factors must be considered, and the likelihood function is ineffective in handling unknown parameters. In the average-likelihood ratio test (ALRT), the unknown parameters are marginalized out using their probability density function (PDF) [8]. However, as the number of unknown factors grows, the likelihood function in ALRT becomes more complicated, resulting in a significant processing cost. To reduce this complexity, the generalized likelihood ratio test (GLRT) was developed, in which the parameters are estimated using a maximum likelihood (ML) estimator [9]. This classifier is biased, which degrades performance on nested modulations such as 16-quadrature amplitude modulation (QAM) and 64-QAM. The hybrid likelihood ratio test (HLRT) improves the performance of the likelihood function with respect to the unknown parameters: it treats the data symbols as discrete random variables uniformly distributed over the alphabet set and averages over them, while the carrier phase is treated as a deterministic variable to be estimated [10]. Since this method requires prior information about the signal, including its carrier frequency and other channel parameters, its implementation becomes difficult in the presence of complex and unknown parameters. Despite their ability to provide optimal solutions, LB methods may not be appropriate in practical scenarios [11].
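For concreteness, the three tests can be summarized schematically as follows, where $r$ is the received signal, $H_i$ the hypothesis that the $i$-th modulation was transmitted, $s$ the data symbols, and $\theta$ (or the channel parameters $\theta_c$) the unknowns; this is the standard textbook formulation rather than one taken verbatim from the cited works:

$$\Lambda_i^{\mathrm{ALRT}}(r) = \int p(r \mid \theta, H_i)\, p(\theta \mid H_i)\, d\theta, \qquad \Lambda_i^{\mathrm{GLRT}}(r) = \max_{\theta}\, p(r \mid \theta, H_i),$$

$$\Lambda_i^{\mathrm{HLRT}}(r) = \max_{\theta_c}\, \mathbb{E}_{s}\big[\, p(r \mid s, \theta_c, H_i) \,\big], \qquad \hat{H} = \arg\max_i \Lambda_i(r).$$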

2.2. Feature-Based (FB) Method

FB methods, which are widely used for AMC, extract features from the received signal and feed them into a classification system [12]. They have been found to outperform LB techniques in terms of reliability and computational overhead. To detect the modulation type of a signal, FB methods operate on a set of data description features, which assist in formulating decisions [6]. The creation of an FB modulation classification method involves two steps: preprocessing and a classification algorithm.
  • Preprocessing: This stage is responsible for extracting features from the received signal. Different features can be chosen based on various circumstances and predictions. Certain immediate aspects of the signal, such as instantaneous signal power, frequency, phase, amplitude, and so on, are retrieved during the feature extraction phase [13]. As a result, these characteristics transform the raw data into patterns that must be learned by the classifier for the purpose of recognition.
  • Classification Algorithm: The classification algorithm utilizes the features from the preprocessor as an input, and outputs the modulation type of the signal for each received signal.
The FB approach creates a higher-dimensional space in which signal characteristics can be separated using a hyperplane [14]. Among the most commonly utilized features in FB approaches, high-order cumulants [15], wavelet transforms [16], and cyclostationary features [17] are principally employed for feature extraction. In noncooperative circumstances, these statistical characteristics are often combined to improve reliability. In the classification step, the classifier compares the statistical properties obtained from the incoming signal against preset thresholds to identify the modulation type.
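As a minimal illustration of such hand-crafted features, the following sketch (our own, not taken from the cited works) computes two widely used higher-order cumulants, C40 and C42, from a vector of complex baseband samples using the standard moment-based formulas:

```python
import numpy as np

def moment(x, p, q):
    # Mixed moment M_pq = E[x^(p-q) * conj(x)^q] of a zero-mean complex signal.
    return np.mean(x ** (p - q) * np.conj(x) ** q)

def hoc_features(x):
    # Higher-order cumulants commonly used as hand-crafted AMC features:
    # C40 = M40 - 3*M20^2,  C42 = M42 - |M20|^2 - 2*M21^2.
    m20, m21 = moment(x, 2, 0), moment(x, 2, 1)
    m40, m42 = moment(x, 4, 0), moment(x, 4, 2)
    c40 = m40 - 3 * m20 ** 2
    c42 = m42 - np.abs(m20) ** 2 - 2 * m21 ** 2
    # Normalize by the squared signal power (C21 = M21) to remove scale dependence.
    return np.array([np.abs(c40), np.abs(c42)]) / (m21.real ** 2)
```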

Comparison between Likelihood-Based and Feature-Based Method

In contemporary research, several classifiers have been presented, notably maximum likelihood [18], distribution test-based [13], and machine learning-based classifiers. The efficiency and statistical complexity of each classifier are routinely measured, and for every realistic implementation, choosing the proper classifier is crucial [19]. In contrast to the processes employed in likelihood-based AMC, statistical methods for feature extraction are typically far less complex in terms of processing cost. Due to their intrinsic low complexity and their suitability for blind operation, feature-based techniques are increasingly popular for real-life scenarios, as they demand no extra information about the signal or channel [20]. Owing to these properties, these two types of classifiers have dominated AMC for decades. In comparison, the LB classifier can find the best solution in the Bayes sense, minimizing the likelihood of incorrect classification, while the FB method can achieve high reliability for recognizing basic modulation types such as BPSK and QPSK [21]. Moreover, for assessing unknown values, LB classifiers possess a high computational complexity [22], whilst the FB classifier's efficacy is significantly affected by feature cohesion. Conventional AMC innovation has always relied on likelihood- and feature-based techniques and seeks to develop more effective features and classifiers [23]. With the advancement of artificial intelligence in the past few decades, deep learning-based algorithms have been utilized to tackle the AMC issues presented in [24,25]. In [24], a convolutional neural network (CNN) for AMC was proposed and its optimal structural depth was investigated. Furthermore, in [25], a data-driven model based on LSTM was presented to overcome the AMC problem, and a design composed of LSTM and CNN modules was considered as a solution for achieving high AMC efficiency across different SNR regions. AMC approaches usually depend on feature extraction to reduce the complexity of signal data and improve classification accuracy [21]. In modern times, deep learning has made significant progress in a variety of applications, including resource allocation in LoRaWAN [26], edge computing [27], control science [28], voice recognition [29], and bioinformatics [30,31]. The capability of DL to discover features directly from data in an end-to-end process is largely responsible for its success on such conceptual tasks, owing to its superior feature extraction and classification abilities. Therefore, DL-based AMC techniques can accurately analyze and detect modulated signals [32].

2.3. Deep Learning Techniques for Automatic Modulation Classification

Model-driven approaches mostly choose their features based on experience [33]. FB techniques lose certain original details whilst extracting some statistical features. This affects the performance of categorization, especially in low-SNR circumstances, while the DL-based network may extract highly representative features from the source signals and incorporate feature extraction as part of the classifier training process. Consequently, it surpasses conventional FB approaches in terms of classification performance [34].
For AMC problems, the first DL technique was used in [7], which consisted of a convolutional neural network (CNN) applied to synthetic datasets for model learning, testing, and analysis (known as RML2016.10A and RML2016.04C). Due to the simplistic architecture of the convolutional design, the accuracy rate was 71.30% and 87.4% with RML2016.10A and RML2016.04C, respectively. The datasets RML2016.10A and RML2016.04C will be referred to as D1 and D2, respectively. The authors of [35] utilized the dataset of [7] to demonstrate the response of a convolutional neural network to complex-valued temporal radio signals. They evaluated the efficacy of radio modulation classification by comparing naively learned features with the expert feature-based approaches commonly used today, and their results revealed that the former approach had superior performance.
The work in [36] used the D1 dataset from [7], and an 80% classification accuracy was achieved through the implementation of a signal distortion correction module (CM). Recently, in DL-based methods, researchers have utilized residual learning techniques, initially introduced by He et al. [37], in which the residual structure is employed to overcome the degradation issue and extract discriminative features. For AMC, a residual learning-based method was deployed in [38]: the ResNet structure was employed to identify the modulation formats, yielding moderate classification accuracy without any network structure adjustments. The authors of [39] proposed a shared deep learning model based on CNN-LSTM, utilizing two expert features to separate wideband frequency modulation (WBFM) and quadrature amplitude modulation (QAM) signals, and achieved better accuracy with D1. In [40], a DL-based technique for categorizing signal modulation was proposed. The researchers compared multiple DL algorithms by leveraging insights from previous studies and utilizing a diverse set of layers to enhance the existing designs. They employed techniques such as convolutional layers, dropout layers, and Gaussian noise layers to reduce overfitting, and they improved accuracy while minimizing compute time by using a reduced number of filters in each layer. In [41], three efficient models for AMC—a convolutional long short-term deep neural network (CLDNN), a long short-term memory neural network (LSTM), and a deep residual network (ResNet)—were investigated, with the goal of ensuring high accuracy whilst shortening the time needed to train the systems.
In this research, we consider the datasets D 1 and D 2 from [7] to train and evaluate a convolutional neural network. We propose a less complex AMC method consisting of fewer parameters, compared to the contemporary methods, in an effort to leverage recent progress in neural networks. This advancement will be useful for the implementation of deep learning-based wireless network solutions.

3. Proposed Design

It has been shown that the feature extraction ability of a CNN improves as the depth of the network increases [42]. Even though increasing the number of network layers allows neural networks to operate more effectively, it comes at a cost. For instance, the vanishing gradient problem is a challenge when training an exceedingly deep network [43]. Moreover, when the network is too deep, a degradation problem arises, causing the training accuracy to decrease [44]. Additionally, there are issues with information loss during information transmission. Some techniques, such as normalized initialization [45] and batch normalization [46], help to alleviate the problem to some extent. Recently, to resolve the above-mentioned issues of DNNs, deep learning models such as residual learning [37] have been shown to achieve outstanding performance by utilizing an effective bottleneck architecture, and they have been effectively applied to modulation classification, whereas squeeze-and-excitation networks (SENet) [47] lead to improved performance at a lower computing cost. We consider the properties of both architectures and propose an effective combination of these models, as detailed in the following sections.

3.1. Residual Learning

The residual network has been proposed to overcome the aforementioned challenges. A residual network adds the input data to the output of certain layers, and the accumulated result is passed on as the block output. Therefore, the network only needs to learn the residual difference between input and output, which reduces learning difficulty, speeds up network convergence, and preserves data integrity. This direct connection is often referred to as a shortcut or skip connection. Residual learning approaches have received a lot of attention since they can extract discriminating features and perform better under various channel and noise configurations. Strong representational capabilities allow the deep residual learning network (ResNet) to continually learn latent information from the signals it receives, thereby increasing classification accuracy.
Considering the benefits of residual learning, we designed a CNN with a residual block to identify the modulation scheme of the Radio ML dataset. The proposed residual learning approach is shown in Figure 2. As depicted, the input data x have two transmission paths: one that transmits the information directly into the output, and another that is evaluated by three convolution layers to produce the output.
This process is formulated as follows:
$$f_1:\; g(x) = \mathrm{Conv2D}(x) \tag{1}$$

$$f_2:\; p(x) = \mathrm{Conv2D}(g(x)) \tag{2}$$

$$f_3:\; q(x) = \mathrm{Conv2D}(p(x)) \tag{3}$$
The residual block takes x as the input and forwards it to a stack of convolution layers represented as $f_1$, $f_2$, and $f_3$, which produce the corresponding outputs $g(x)$, $p(x)$, and $q(x)$. The residual block can be represented as
$$f(x) = x + q(x) \tag{4}$$
The inputs to the model are in-phase and quadrature (I/Q) samples of raw data with a shape of 2 × 128. Each convolution layer has 32 filters, with kernel sizes of 5 × 5, 3 × 3, and 1 × 1, respectively. The output of each convolution is passed through an activation function, which adds non-linearity to the neural network. The leaky rectified linear unit (Leaky ReLU), a popular activation function due to its simplicity, is employed in this network; it is useful for preventing the vanishing gradient problem. The padding is kept the same in each layer, and kernels are initialized with a Glorot uniform initializer [48]. The output of the convolution layers is followed by batch normalization, which normalizes the input and speeds up the training process. The final layer is the max pooling layer, which selects the maximum value of the target region, reduces the number of training parameters, and extracts the important characteristics found in the data; this layer helps enhance the model's reliability and effectiveness. The pool size in this network is 1 × 2. To avoid overfitting, we employed dropout regularization with a rate of 0.5. The final output of the network is represented as
$$Y = \mathrm{Maxpool}\big(\mathrm{LeakyReLU}\big(\mathrm{BN}(f(x))\big)\big) \tag{5}$$
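As a concrete illustration, a minimal Keras sketch of this residual block following Equations (1)–(5) is given below (our own sketch, not the authors' released code; the 1 × 1 projection on the skip path is an added assumption for the case where the input channel count differs from the number of filters):

```python
from tensorflow.keras import layers

def residual_block(x, filters=32):
    """Residual block of Section 3.1: three stacked convolutions plus a skip."""
    shortcut = x
    y = layers.Conv2D(filters, (5, 5), padding="same",
                      kernel_initializer="glorot_uniform")(x)   # f1: g(x)
    y = layers.Conv2D(filters, (3, 3), padding="same",
                      kernel_initializer="glorot_uniform")(y)   # f2: p(x)
    y = layers.Conv2D(filters, (1, 1), padding="same",
                      kernel_initializer="glorot_uniform")(y)   # f3: q(x)
    if shortcut.shape[-1] != filters:                           # match channel counts
        shortcut = layers.Conv2D(filters, (1, 1), padding="same")(shortcut)
    y = layers.Add()([shortcut, y])                             # f(x) = x + q(x), Eq. (4)
    y = layers.BatchNormalization()(y)
    y = layers.LeakyReLU()(y)
    y = layers.MaxPooling2D(pool_size=(1, 2))(y)                # Eq. (5)
    return layers.Dropout(0.5)(y)                               # dropout rate 0.5
```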

3.2. Squeeze–Excitation Network

Convolutional filters are responsible for extracting low- and high-level features from the input data. As the data proceed through successive layers, the spatial and channel information of the data are fused together to retain the best representation of the input. In conventional CNNs, each input channel is weighted equally when creating the output activation for the next layer. In SENets, however, each channel-wise feature value is adaptively readjusted by explicitly modeling the interconnections between channels. This is carried out by squeezing each input feature map to a single number using global average pooling, resulting in a vector of length C, where C is the number of convolutional channels. The input feature map of the squeeze-and-excitation block is denoted by $X \in \mathbb{R}^{H \times W \times C}$, where H, W, and C represent the height, width, and number of channels of the input feature map, respectively. The squeeze operation is mathematically represented as
$$z = \mathrm{GlobalAvgPool}(X) \in \mathbb{R}^{C} \tag{6}$$
Here, the function GlobalAvgPool performs the average pooling operation over the height and width dimensions of the input feature map X. To perform the excitation operation, the vector is fed to a network consisting of two fully connected (FC) layers, which output a vector of the same length as the input. The weight matrices of these two FC layers are denoted as $W_1$ and $W_2$, which can be expressed as
$$W_1 \in \mathbb{R}^{\frac{C}{r} \times C} \quad \text{and} \quad W_2 \in \mathbb{R}^{C \times \frac{C}{r}} \tag{7}$$
These C values are then used as weights on the original feature maps, with each channel scaled according to its importance. The authors of SENet [47] demonstrated that adding SE blocks to ResNet-50 yields an accuracy similar to that of ResNet-101 at only about half the computational cost. The design of the proposed SE block is shown in Figure 3. As represented, the input has two connections: one leads directly to the output, and the other is fed through an average pooling layer, in which each channel is squeezed into a single numeric value. This is followed by a dense layer with a leaky ReLU, which reduces the output channel dimension by the reduction ratio r, set to 16, such that the intermediate output of the first FC layer has a smaller dimension. The output of the excitation process can be written as follows, where σ is the softmax function, which gives each channel a smooth gating value.
$$s = \sigma\big(W_2 \cdot \mathrm{LeakyReLU}(W_1 \cdot z)\big) \in \mathbb{R}^{C} \tag{8}$$
Here, LeakyReLU denotes the leaky rectified linear unit activation function. Finally, the output of the excitation operation is reshaped to have the same size as the input feature map X and multiplied element-wise with the input feature map X to produce the output of the squeeze-and-excitation block, illustrated as
$$Y = \mathrm{reshape}(s) \cdot X \in \mathbb{R}^{H \times W \times C} \tag{9}$$
In Equation (9), the function $\mathrm{reshape}(\cdot)$ transforms the vector s to have the same shape as the input feature map X. The skip connection from the input feature map X to the output feature map Y is added element-wise after the reshaping and scaling operations to obtain the final output of the block.
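A minimal Keras sketch of the SE block of Equations (6)–(9), including the final element-wise skip connection described above, is shown below (our own sketch, not the authors' code; note that the paper specifies softmax gating, which is used here, whereas the original SENet uses a sigmoid):

```python
from tensorflow.keras import layers

def se_block(x, r=16):
    """Squeeze-and-excitation block, Eqs. (6)-(9), with the final skip connection."""
    channels = x.shape[-1]
    z = layers.GlobalAveragePooling2D()(x)               # squeeze, Eq. (6)
    s = layers.Dense(channels // r)(z)                   # W1: C -> C/r
    s = layers.LeakyReLU()(s)
    s = layers.Dense(channels, activation="softmax")(s)  # W2 + gating sigma, Eq. (8)
    s = layers.Reshape((1, 1, channels))(s)              # reshape(s), Eq. (9)
    y = layers.Multiply()([x, s])                        # channel-wise rescaling
    return layers.Add()([x, y])                          # element-wise skip connection
```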

Overall Proposed Block

In various advanced communication systems, accuracy is among the most significant variables affecting the quality of service. This calls for an AMC model that performs well in terms of both computational cost and classification performance, specifically if AMC is intended for deployment on edge devices for end users. Therefore, this section presents the design of the proposed cost-effective method for modulation classification, utilizing the residual learning and squeeze-and-excitation networks.
Figure 4 depicts the architecture of the proposed modulation classification technique. The suggested design is based on four residual learning blocks and four SE blocks. Firstly, an input reshape layer, which makes the dimensions of the data compatible with the model input without changing the data, is linked to the first residual block. The dimensions of the input data are 2 × 128, consisting of in-phase and quadrature (I and Q) samples. The modulation signal is sampled with an l = 128 sample rectangular window that separates the data for training and testing. The residual learning architecture consists of three convolution layers. These convolution layers are interconnected and are able to capture common features and reduce the spatial dimension. Two of these layers use stacked asymmetric convolution kernels, with kernel sizes of (3, 3) and (1, 1), to reduce the number of trainable parameters without compromising the quality of the retrieved features. This is further followed by batch normalization and max pooling layers, with a pool size and stride of (2, 2). The residual blocks are applied to retrieve high-level features and perform identity mapping; each block of convolution layers performs feature extraction. Following each residual block, an SE block is deployed, consisting of global average pooling with two fully connected layers, resulting in a decreased number of parameters. The SE blocks have output channel dimensions of 128, 64, 32, and 16, and the transformation ratio r is set to 16. The transformation ratio is a key hyperparameter that enables adjustment of the computational burden and performance of the SE blocks in the model. To provide even better efficiency, the network then passes through a fully connected layer (SELU) and the activation function (softmax). To increase the accuracy of the model, skip connections are employed around the SE blocks to prevent model degradation and enhance the learning and representational properties of deep networks.
Various features gathered from each block and the informational identity retained throughout the network via the skip connection approach are merged to enhance the classification model. As a result, during the network training phase, the proposed approach can overcome the problems of vanishing gradients and overfitting. At the end, the network contains two fully connected layers and the softmax layer. The number of neurons in the final FC layer is set to the number of modulation classes of the stated dataset. The number of parameters is directly related to the computational complexity; the proposed model, with only 253,274 parameters, is extremely efficient in terms of computational burden. Reducing the number of residual and SE blocks, as well as the kernel sizes in the convolution layers, would result in an even lower number of parameters; however, because the system would then not be dense enough to generate highly discriminative features at multiscale interpretations of the feature maps, the classification performance would be reduced.
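Combining the two blocks, the overall network of Figure 4 could be assembled as sketched below, reusing the residual_block and se_block helpers from the previous sketches. The four stage widths (128, 64, 32, 16), the SELU fully connected layer, and the softmax output follow the text; the width of 128 for the first FC layer is our assumption:

```python
from tensorflow.keras import layers, Model

def build_model(num_classes=11):
    inp = layers.Input(shape=(2, 128))           # raw I/Q frame
    x = layers.Reshape((2, 128, 1))(inp)         # input reshape layer
    for filters in (128, 64, 32, 16):            # four stacked RS stages
        x = residual_block(x, filters)
        x = se_block(x, r=16)
    x = layers.Flatten()(x)                      # features later used for t-SNE
    x = layers.Dense(128, activation="selu")(x)  # first FC layer (width assumed)
    out = layers.Dense(num_classes, activation="softmax")(x)
    return Model(inp, out)
```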
The latent features of the proposed model were extracted from the flatten layer and visualized using t-distributed stochastic neighbor embedding (t-SNE). The t-SNE of the input space and the latent feature space are calculated and visualized in Figure 5a,b, respectively. The t-SNE plots of the input and test feature space illustrate the superior discriminating feature extraction capability of the proposed network. However, there are a few instances of misclassification, such as the overlap between the QAM16 and QAM64 embeddings. It is established in [37] that as the network depth increases, the accuracy saturates and then degrades; therefore, increasing the number of layers in a network may lead to more training errors. Residual learning addresses this by identity mapping, whereby a rather shallow network can achieve an equivalent, if not superior, performance compared to its deeper counterparts. This is because the original mapping is recast into the relationship presented in Equation (4), which in turn results in a better optimization of the residual mapping compared to the original non-residual mapping. It is well known that most CNNs weight each of the input feature maps equally while performing convolution in a certain layer [49]. This might be suitable in tasks related to computer vision, but in tasks such as AMC, some feature maps may hold more information than others. Therefore, the squeeze-and-excitation block ensures that appropriate emphasis is put on such feature maps. Subsequently, the proposed method combines these two feature extractors in various configurations and presents the best configuration as a result of the ablation study presented in the ensuing sections.
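The visualization step can be sketched as follows, assuming a trained model and held-out arrays X_test (I/Q frames) and y_test (integer labels); the layer index assumes the flatten layer sits three layers from the output, as in the assembly sketch above:

```python
from tensorflow.keras import Model
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

feat_model = Model(model.input, model.layers[-3].output)  # flatten-layer output
feats = feat_model.predict(X_test)                        # latent features
emb = TSNE(n_components=2, perplexity=30).fit_transform(feats)
plt.scatter(emb[:, 0], emb[:, 1], c=y_test, s=2, cmap="tab20")
plt.title("t-SNE of the latent feature space")
plt.show()
```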

4. Results and Discussion

This section begins with an overview of the datasets [7] used in the simulation, followed by a discussion of the performance of our suggested approach under the influence of various noise distributions. In this study, the training and testing processes were performed in Python 3.9.13, using Keras based on TensorFlow 2.4.1, on an RTX 2070 GPU (2560 CUDA cores, 8 GB GDDR6 VRAM) with an Intel(R) Core(TM) i5-9500 CPU @ 3.00 GHz (6 cores) and 16 GB of RAM. To tune the network, all models are trained in an end-to-end manner using the Adam optimizer. The learning rate in our experiments is set to 1 × 10⁻⁴. The presented system is trained for 100 epochs with a batch size of 512.
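The stated training configuration corresponds to the following Keras calls (a sketch; X_train/Y_train and the validation arrays are assumed to hold I/Q frames and one-hot labels):

```python
import tensorflow as tf

model = build_model(num_classes=11)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy",  # one-hot modulation labels
              metrics=["accuracy"])
history = model.fit(X_train, Y_train,
                    batch_size=512, epochs=100,
                    validation_data=(X_val, Y_val))
```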

4.1. Dataset Description

To evaluate and substantiate the effectiveness of the presented AMC approach, two datasets generated with GNU Radio [7] are used for the modulation classification task. The datasets are designated as RML2016.10A and RML2016.04C, and their parameter list is shown in Table 1. The characteristics of pulse shaping, modulation, and carried data are made similar between them, and the data are created randomly to match data from an actual system. Two data sources are used to build the datasets: the publicly accessible Serial Episode No. 1, along with other continuous sources such as voice, is used as an input for the analog modulations, and the full Gutenberg edition of Shakespeare's plays is used for digital communication. A block normalizer is used for data whitening in order to make the digitally modulated data equiprobable. Then, to obtain the effect of unknown scaling and translation, the synthetic signals are routed across a number of channels. The final dataset is produced by the GNU Radio channel model unit, which divides the time series signals into training and testing sets using a rectangular window of 128 samples. Each sample in these datasets comprises two data channels (I/Q) of raw data with a size of 2 × 128, consisting of real and imaginary values. The datasets include 11 modulated signal types, 8 digital and 3 analog, all of which are frequently employed in wireless communication networks: BPSK, QPSK, 8PSK, CPFSK, GFSK, PAM4, QAM16, and QAM64 for digital modulations, and AM-DSB, WBFM, and AM-SSB for analog modulations. The signal-to-noise ratio (SNR) varies from −18 dB to 20 dB in steps of 2 dB, and all SNRs in this range are represented in both the training and test sets.
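For reference, the public releases of these datasets are distributed as Python 2 pickles keyed by (modulation, SNR) pairs; a loading sketch is shown below (the file name and the latin1 encoding reflect our assumptions about the public RML2016.10A release):

```python
import pickle
import numpy as np

with open("RML2016.10a_dict.pkl", "rb") as f:
    data = pickle.load(f, encoding="latin1")   # Python 2 pickle

mods = sorted({mod for mod, snr in data})      # the 11 modulation names
X, y, snrs = [], [], []
for (mod, snr), frames in data.items():        # frames: (N, 2, 128) I/Q samples
    X.append(frames)
    y.extend([mods.index(mod)] * len(frames))
    snrs.extend([snr] * len(frames))
X = np.concatenate(X)                          # (num_examples, 2, 128)
y, snrs = np.array(y), np.array(snrs)
```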

4.2. Comparative Experiments of Various Networks

The two datasets mentioned in Table 1 are used to train the neural networks and evaluate their performance. The Glorot uniform initializer is utilized to initialize the weights of the convolutional layers in each network, while the He initializer is employed to initialize the weights of the fully connected layers. The training weights are learned through the Adam optimizer with a learning rate of 0.0001, which minimizes the categorical cross-entropy loss function. We compared the performance of the proposed model with existing modulation classification methods, namely GRU2, CLDNN [50], ResNet, Inception [51], 2DCNN [35], and ANN [52]. The proposed design has a lower number of parameters and low computational complexity, as shown in Table 2. The total number of parameters of the proposed model is 253,274, which is fewer than contemporary automatic modulation classification models that focused on parameter reduction using alternative methods for performing convolution. The proposed model outperforms the baseline Inception [53] method with a parameter reduction of 98.9%, a larger reduction than for the other contemporary methods. For datasets RML2016.10A and RML2016.04C, the number of parameters and the percentage reduction relative to the contemporary approaches are tabulated in Table 2. Compared to the GRU2 method, the proposed method shows a 79.9% parameter reduction, whereas compared to ResNet, it shows an 88.7% reduction. Similarly, reductions of 89.1% and 96.1% in the number of parameters were observed compared to CLDNN [50] and Conv2D [12], respectively, and a reduction of 96.22% compared to 2DCNN [35]. When compared to ANN [52], depthwise convolution, and depthwise separable convolution [12], parameter reductions of 97.2%, 97.5%, and 98.3%, respectively, were observed for the proposed method. We also compared the number of parameters of the proposed method with that of a pruned network; in particular, a pruned AlexNet utilized for the classification of radio modulations [54]. The accuracy curves are also plotted in Figure 6 and Figure 7 for D1 and D2, respectively.
We also evaluated the effectiveness of the proposed AMC method by comparing it with popular traditional feature-based methods, including Gaussian naive Bayes, Bernoulli naive Bayes, RBF-SVM, linear-SVM, decision tree, random forest, KNN, and ELM [35,55,56]. Figure 8 displays the classification accuracy at various SNRs. We show the classification accuracy of D 2 for several SNR settings in which the proposed method performed significantly better than the other methods.
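As an illustration of how such feature-based baselines can be reproduced, the following scikit-learn sketch trains two of them on flattened I/Q frames (the cited baselines use richer hand-crafted features, so this is only a simplified stand-in):

```python
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

X_flat = X.reshape(len(X), -1)   # 2 x 128 frames flattened to 256-dim vectors
Xtr, Xte, ytr, yte = train_test_split(X_flat, y, test_size=0.3, random_state=0)

for clf in (RandomForestClassifier(n_estimators=100), KNeighborsClassifier(5)):
    clf.fit(Xtr, ytr)
    print(type(clf).__name__, clf.score(Xte, yte))
```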

4.3. Ablation Study

The goal of an ablation study is to investigate how modifying or replacing network components and hyperparameters affects the overall performance. We performed five groups of ablation studies, as summarized in Table 3.
For each configuration, the model was trained and tested 10 times, and the one yielding the best performance in terms of accuracy was selected as the final model. The performance comparison of several configurations is shown in Figure 9, indicating that the RS-proposed network has better classification accuracy than its counterparts.
In the process of building the residual learning and squeeze-and-excitation network (RS), a range of models were developed by varying different aspects, such as the number of RS blocks, the number of fully connected layers, the replacement of the max pooling layer, and the number of filters and kernel sizes. These models further process the input data after the features have been extracted by the convolution layers. The objective is to analyze and determine the optimal network structure that can effectively utilize a larger number of training features to enhance the classification accuracy and robustness of the model. These networks are denoted RS-1, RS-2, RS-3, RS-4, and RS-proposed; their configurations are illustrated in Table 3.
A random selection of 70% of the total data is used as the training set, with the remaining 30% used as the verification set. During training, the batch size is set to 512 and the number of epochs to 50. Utilizing the categorical cross-entropy loss function and the Adam optimizer, the learning rate is set to 0.0001. The experimental findings are reported in Table 4.
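This split and training configuration correspond to, for example, the following sketch (Y_onehot is assumed to hold one-hot targets for the integer labels y):

```python
from sklearn.model_selection import train_test_split

# 70/30 split, stratified by class label; 50 epochs with a batch size of 512.
X_train, X_val, Y_train, Y_val = train_test_split(
    X, Y_onehot, train_size=0.7, stratify=y, random_state=0)
model.fit(X_train, Y_train, batch_size=512, epochs=50,
          validation_data=(X_val, Y_val))
```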
The proposed network achieves the best findings, as illustrated in Table 4; compared to the other ablation configurations, it has the best classification performance. This indicates that the diverse benefits of the different components of the residual learning and squeeze-and-excitation blocks (RS blocks) complement each other, and their combination leads to exceptional performance. In particular, the proposed network and the RS-3 network perform better than the RS-4 network, indicating that rich temporal information for modulation classification can be extracted by the temporal feature extraction layers. The addition of signal-relevant features raises the overall number of features, which makes the model prone to overfitting; this overfitting can be reduced by removing one FC layer. Furthermore, the comparison of RS-3 with RS-proposed reveals that designing lightweight networks with multiple RS blocks and varying the number of parameters and kernel sizes can be a meaningful approach to reducing model complexity. This insight can help in developing more efficient models.
Comparing these models, RS-1, which has fewer layers and one FC layer, outperforms RS-2. On the other hand, RS-3 and RS-4 are based on RS-2, adding various filter and kernel sizes and changing the pooling layers, respectively. The comparison of RS-3 and RS-2 showed that alternating the filter and kernel sizes with two FC layers can yield better classification accuracy than RS-2, which has only six RS layers and two FC layers. In RS-4, the max pooling 2D layer is replaced by global average pooling, which takes the average of each feature map and feeds the resulting vector directly into the activation layer; this configuration yields lower classification accuracy than the others due to overfitting. Finally, our proposed network consists of a multichannel learning framework that plays a significant role in the performance improvement. Additionally, group convolution could be utilized to reduce the model complexity while maintaining its effectiveness. Overall, we conclude that our proposed network performs better and achieves significantly higher classification accuracy than the other network configurations.

4.4. Comparison with Contemporary Methods

In [7], an AMC technique using D2 was presented, in which the use of a convolutional neural network was investigated in the domain of complex-valued temporal radio signals. The efficacy of radio modulation classification using naively learned features was compared to expert feature-based techniques, and the results showed significant performance; their approach yielded 91.91% at 18 dB SNR. In [12], the authors proposed deep learning techniques to meet the latency requirements of Internet of Things (IoT) applications. An efficient convolutional neural network based on 2D convolution, depthwise convolution, and depthwise separable convolution was presented to classify the modulation of received signals. On D1 and D2, their models achieved 71.3% and 87.4% accuracy with Conv2D convolution at 18 dB SNR, 69.90% and 92.10% with depthwise convolution, and 71.25% and 83.03% with depthwise separable convolution.
A modified CNN was utilized in [35] to classify dataset D2. On the basis of the naively learned technique, the results were used to determine the best features for radio modulation classification; their approach obtained 87.4% accuracy at 18 dB SNR. In [36], the authors deployed a signal distortion correction module (CM) to enhance the performance of modulation recognition schemes. They demonstrated that their suggested module combined with a CNN can achieve better accuracy than the CNN and CLDNN alone. Furthermore, their model was based on a neural network that can be regarded as an estimator of the carrier frequency and phase offset introduced by the channel. They achieved 80% accuracy at an SNR above 14 dB, with significant improvement above 0 dB, on dataset D1. In [57], IQ-modulated radio frequency signals were classified analytically by using reservoir computing on narrowband optoelectronic oscillators (OEOs) operated by a continuous-wave semiconductor laser. Additionally, they designed high-Q OEOs for ultralow-phase-noise microwave production, consisting of an efficient hardware architecture to process such multi-GHz modulated signals; their architecture achieved 89% accuracy at 18 dB SNR on dataset D2. Compared to this prior work, our proposed model can learn latent information from received signals repeatedly, resulting in higher accuracies of 81.0% and 94.05% for D1 and D2, respectively, at 18 dB SNR, with fewer parameters. The results of the aforementioned methods are tabulated in Table 5.
Furthermore, Figure 10 demonstrates the confusion matrices for both datasets at an SNR value of 18 dB. In a confusion matrix, each row represents the true class, while each column represents the predicted class. Figure 10b illustrates the performance of the proposed network for dataset RML2016.04C; the confusion matrix shows the superior classification performance of the network, surpassing the contemporary methods. Figure 10a shows the classification performance on D1, which indicates the two main factors affecting the classification accuracy at high SNR. WBFM is misclassified as AM-DSB; this situation could be the result of silent periods in the analog audio stream of the dataset. On the other hand, the constellation points of QAM16 are similar to the constellation points of QAM64, and the two may exhibit similar patterns of noise and interference, so the network misclassified QAM16 as QAM64. Furthermore, it has been shown in a study that some of the SNR values in 64-QAM have two different noise floor values, making the classification problem difficult [59]. Using RML2016.10A, an edge intelligence algorithm based on a convolutional neural network (CNN) with an attention mechanism was developed to carry out modulation recognition (MR), and a similar misclassification of QAM modulations was reported [60]. The authors suggested that the misclassification of QAM modulations was due to a lack of information and similar symbolic structures when short-term observations are made in the presence of a large amount of noise. Likewise, in [61], RML2016.10A was utilized to propose a framework that integrated one-dimensional (1D) convolution, two-dimensional (2D) convolution, and long short-term memory (LSTM) layers for feature extraction; the results presented in that study also indicated a similar misclassification of QAM modulations. Several other works that utilized this dataset, including [51,62,63,64], have reported similar misclassifications of QAM modulations. The recurrence of this problem in several recent works suggests that the misclassification may stem from ineffective curation of the QAM portion of the dataset. In the future, we aim to improve the classification performance of the proposed model by introducing a fusion of various feature extraction techniques, including t-SNE, PCA, etc., along with the convolutional features, for various communication datasets. Considering the superior performance of the proposed model for the other modulation schemes, it is anticipated that the model would exhibit adequate performance in classifying higher-order QAMs.
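The per-SNR confusion matrices can be reproduced as sketched below, restricting the test set to frames at 18 dB (snrs_test, X_test, and y_test are assumed to come from the data split; mods is the modulation name list):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

mask = snrs_test == 18                                 # keep only 18 dB frames
pred = np.argmax(model.predict(X_test[mask]), axis=1)  # predicted class indices
ConfusionMatrixDisplay.from_predictions(
    y_test[mask], pred, display_labels=mods, xticks_rotation=45)
plt.show()
```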

5. Conclusions

In this paper, an innovative network based on residual learning and the squeeze-and-excitation architecture has been proposed to identify the modulation types of radio signals. The proposed method outperforms traditional machine learning and deep learning feature-based methods in high-SNR domains. In comparison with conventional CNN, separable convolution, and depthwise convolution models, the proposed model requires significantly fewer parameters and shows outstanding performance in classifying the modulation of signals. Future research will focus on improving performance on AMC problems in the lower-SNR region by evaluating the received signals more effectively. Further improvements to the DL algorithms in terms of network complexity will also be considered in the field of AMC.

Author Contributions

Conceptualization, M.Z.N. and M.U.; methodology, M.Z.N.; software, M.Z.N.; validation, M.Z.N., M.S.I. and M.U.; formal analysis, M.Z.N. and M.S.I.; investigation, M.S.I. and M.U.; resources, J.-A.L.; data curation, M.Z.N. and M.S.I.; writing—original draft preparation, M.Z.N.; writing—review and editing, M.S.I. and M.U.; visualization, M.Z.N.; supervision, J.-A.L.; project administration, J.-A.L.; funding acquisition, J.-A.L. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by a research fund from Chosun University, 2022.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets employed in this study are publicly available at [7] https://www.deepsig.io/datasets (accessed on 23 March 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Liu, X.; Wang, Q.; Wang, H. A Two-Fold Group Lasso Based Lightweight Deep Neural Network for Automatic Modulation Classification. In Proceedings of the 2020 IEEE International Conference on Communications Workshops (ICC Workshops), Dublin, Ireland, 7–11 June 2020; pp. 1–6. [Google Scholar] [CrossRef]
  2. Kim, B.; Kim, J.; Chae, H.; Yoon, D.; Choi, J.W. Deep neural network-based automatic modulation classification technique. In Proceedings of the 2016 International Conference on Information and Communication Technology Convergence (ICTC), Jeju, Republic of Korea, 19–21 October 2016; pp. 579–582. [Google Scholar]
  3. Liang, Y.C.; Chen, K.C.; Li, G.Y.; Mahonen, P. Cognitive radio networking and communications: An overview. IEEE Trans. Veh. Technol. 2011, 60, 3386–3407. [Google Scholar] [CrossRef]
  4. Triantaris, P.; Tsimbalo, E.; Chin, W.H.; Gündüz, D. Automatic modulation classification in the presence of interference. In Proceedings of the 2019 European Conference on Networks and Communications (EuCNC), Valencia, Spain, 18–21 June 2019; pp. 549–553. [Google Scholar]
  5. Huang, S.; Dai, R.; Huang, J.; Yao, Y.; Gao, Y.; Ning, F.; Feng, Z. Automatic modulation classification using gated recurrent residual network. IEEE Internet Things J. 2020, 7, 7795–7807. [Google Scholar] [CrossRef]
  6. Dobre, O.A.; Abdi, A.; Bar-Ness, Y.; Su, W. Survey of automatic modulation classification techniques: Classical approaches and new trends. IET Commun. 2007, 1, 137–156. [Google Scholar] [CrossRef]
  7. O’Shea, T.J.; West, N. Radio machine learning dataset generation with GNU Radio. In Proceedings of the 6th GNU Radio Conference, Boulder, CO, USA, 12–16 September 2016; Volume 1. [Google Scholar]
  8. Ramezani-Kebrya, A.; Kim, I.M.; Kim, D.I.; Chan, F.; Inkol, R. Likelihood-based modulation classification for multiple-antenna receiver. IEEE Trans. Commun. 2013, 61, 3816–3829. [Google Scholar] [CrossRef]
  9. Polydoros, A.; Kim, K. On the detection and classification of quadrature digital modulations in broad-band noise. IEEE Trans. Commun. 1990, 38, 1199–1211. [Google Scholar] [CrossRef]
  10. Panagiotou, P.; Anastasopoulos, A.; Polydoros, A. Likelihood ratio tests for modulation classification. In Proceedings of the MILCOM 2000 Proceedings. 21st Century Military Communications. Architectures and Technologies for Information Superiority (Cat. No. 00CH37155), Los Angeles, CA, USA, 22–25 October 2000; IEEE: Piscataway, NJ, USA, 2000; Volume 2, pp. 670–674. [Google Scholar]
  11. Majhi, S.; Gupta, R.; Xiang, W.; Glisic, S. Hierarchical hypothesis and feature-based blind modulation classification for linearly modulated signals. IEEE Trans. Veh. Technol. 2017, 66, 11057–11069. [Google Scholar] [CrossRef]
  12. Usman, M.; Lee, J.A. AMC-IoT: Automatic Modulation Classification Using Efficient Convolutional Neural Networks for Low Powered IoT Devices. In Proceedings of the 2020 International Conference on Information and Communication Technology Convergence (ICTC), Jeju, Republic of Korea, 21–23 October 2020; pp. 288–293. [Google Scholar]
  13. Ghasemzadeh, P.; Banerjee, S.; Hempel, M.; Sharif, H. Accuracy analysis of feature-based automatic modulation classification with blind modulation detection. In Proceedings of the 2019 International Conference on Computing, Networking and Communications (ICNC), Honolulu, HI, USA, 18–21 February 2019; pp. 1000–1004. [Google Scholar]
  14. Khan, R.; Yang, Q.; Ullah, I.; Rehman, A.U.; Tufail, A.B.; Noor, A.; Rehman, A.; Cengiz, K. 3D convolutional neural networks based automatic modulation classification in the presence of channel noise. IET Commun. 2022, 16, 497–509. [Google Scholar] [CrossRef]
  15. Huang, S.; Yao, Y.; Wei, Z.; Feng, Z.; Zhang, P. Automatic modulation classification of overlapped sources using multiple cumulants. IEEE Trans. Veh. Technol. 2016, 66, 6089–6101. [Google Scholar] [CrossRef]
  16. Ho, K.; Prokopiw, W.; Chan, Y. Modulation identification of digital signals by the wavelet transform. IEE Proc.-Radar, Sonar Navig. 2000, 147, 169–176. [Google Scholar] [CrossRef]
  17. Dobre, O.A.; Oner, M.; Rajan, S.; Inkol, R. Cyclostationarity-based robust algorithms for QAM signal identification. IEEE Commun. Lett. 2011, 16, 12–15. [Google Scholar] [CrossRef]
  18. Huynh-The, T.; Nguyen, T.V.; Pham, Q.V.; Kim, D.S.; Da Costa, D.B. MIMO-OFDM Modulation Classification Using Three-Dimensional Convolutional Network. IEEE Trans. Veh. Technol. 2022, 71, 6738–6743. [Google Scholar] [CrossRef]
  19. Cardoso, C.; Castro, A.R.; Klautau, A. An efficient FPGA IP core for automatic modulation classification. IEEE Embed. Syst. Lett. 2013, 5, 42–45. [Google Scholar] [CrossRef]
  20. Hou, C.; Liu, G.; Tian, Q.; Zhou, Z.; Hua, L.; Lin, Y. Multi-signal Modulation Classification Using Sliding Window Detection and Complex Convolutional Network in Frequency Domain. IEEE Internet Things J. 2022, 9, 19438–19449. [Google Scholar] [CrossRef]
  21. Hazza, A.; Shoaib, M.; Alshebeili, S.A.; Fahad, A. An overview of feature-based methods for digital modulation classification. In Proceedings of the 2013 1st International Conference on Communications, Signal Processing, and Their Applications (ICCSPA), Sharjah, United Arab Emirates, 12–14 February 2013; pp. 1–6. [Google Scholar]
  22. Hameed, F.; Dobre, O.A.; Popescu, D.C. On the likelihood-based approach to modulation classification. IEEE Trans. Wirel. Commun. 2009, 8, 5884–5892. [Google Scholar] [CrossRef]
  23. Abdel-Moneim, M.A.; Al-Makhlasawy, R.M.; Abdel-Salam Bauomy, N.; El-Rabaie, E.S.M.; El-Shafai, W.; Farghal, A.E.; Abd El-Samie, F.E. An efficient modulation classification method using signal constellation diagrams with convolutional neural networks, Gabor filtering, and thresholding. Trans. Emerg. Telecommun. Technol. 2022, 33, e4459. [Google Scholar] [CrossRef]
  24. Zhou, Y.; Fadlullah, Z.M.; Mao, B.; Kato, N. A deep-learning-based radio resource assignment technique for 5G ultra dense networks. IEEE Netw. 2018, 32, 28–34. [Google Scholar] [CrossRef]
  25. Huang, H.; Guo, S.; Gui, G.; Yang, Z.; Zhang, J.; Sari, H.; Adachi, F. Deep learning for physical-layer 5G wireless techniques: Opportunities, challenges and solutions. IEEE Wirel. Commun. 2019, 27, 214–222. [Google Scholar] [CrossRef]
  26. Farhad, A.; Kim, D.H.; Yoon, J.S.; Pyun, J.Y. Deep Learning-Based Channel Adaptive Resource Allocation in LoRaWAN. In Proceedings of the 2022 International Conference on Electronics, Information, and Communication (ICEIC), Jeju, Republic of Korea, 6–9 February 2022; pp. 1–5. [Google Scholar]
  27. Blanco-Filgueira, B.; Garcia-Lesta, D.; Fernández-Sanjurjo, M.; Brea, V.M.; López, P. Deep learning-based multiple object visual tracking on embedded system for IoT and mobile edge computing applications. IEEE Internet Things J. 2019, 6, 5423–5431. [Google Scholar] [CrossRef]
  28. Usama, M.; Lee, I.Y. Data-Driven Non-Linear Current Controller Based on Deep Symbolic Regression for SPMSM. Sensors 2022, 22, 8240. [Google Scholar] [CrossRef]
  29. Deng, L.; Hinton, G.; Kingsbury, B. New types of deep neural network learning for speech recognition and related applications: An overview. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; pp. 8599–8603.
  30. Usman, M.; Khan, S.; Lee, J.A. AFP-LSE: Antifreeze proteins prediction using latent space encoding of composition of k-spaced amino acid pairs. Sci. Rep. 2020, 10, 7197.
  31. Usman, M.; Khan, S.; Park, S.; Lee, J.A. AoP-LSE: Antioxidant Proteins Classification Using Deep Latent Space Encoding of Sequence Features. Curr. Issues Mol. Biol. 2021, 43, 1489–1501.
  32. Wei, X.; Luo, W.; Zhang, X.; Yang, J.; Gui, G.; Ohtsuki, T. Differentiable Architecture Search-Based Automatic Modulation Classification. In Proceedings of the 2021 IEEE Wireless Communications and Networking Conference (WCNC), Nanjing, China, 29 March–1 April 2021; pp. 1–6.
  33. Ullah, A.; Abbas, Z.H.; Zaib, A.; Ullah, I.; Muhammad, F.; Idrees, M.; Khattak, S. Likelihood ascent search augmented sphere decoding receiver for MIMO systems using M-QAM constellations. IET Commun. 2020, 14, 4152–4158.
  34. Guo, Y.; Liu, Y.; Oerlemans, A.; Lao, S.; Wu, S.; Lew, M.S. Deep learning for visual understanding: A review. Neurocomputing 2016, 187, 27–48.
  35. O’Shea, T.J.; Corgan, J.; Clancy, T.C. Convolutional radio modulation recognition networks. In International Conference on Engineering Applications of Neural Networks; Springer: Cham, Switzerland, 2016; pp. 213–226.
  36. Yashashwi, K.; Sethi, A.; Chaporkar, P. A learnable distortion correction module for modulation recognition. IEEE Wirel. Commun. Lett. 2018, 8, 77–80.
  37. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  38. Liu, X.; Yang, D.; El Gamal, A. Deep neural network architectures for modulation classification. In Proceedings of the 2017 51st Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 29 October–1 November 2017; pp. 915–919.
  39. Yao, T.; Chai, Y.; Wang, S.; Miao, X.; Bu, X. Radio signal automatic modulation classification based on deep learning and expert features. In Proceedings of the 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), Chongqing, China, 12–14 June 2020; Volume 1, pp. 1225–1230.
  40. Zhang, H.; Huang, M.; Yang, J.; Sun, W. A Data Preprocessing Method for Automatic Modulation Classification Based on CNN. IEEE Commun. Lett. 2020, 25, 1206–1210.
  41. Ramjee, S.; Ju, S.; Yang, D.; Liu, X.; Gamal, A.E.; Eldar, Y.C. Fast deep learning for automatic modulation classification. arXiv 2019, arXiv:1901.05850.
  42. Miao, J.; Xu, S.; Zou, B.; Qiao, Y. ResNet based on feature-inspired gating strategy. Multimed. Tools Appl. 2022, 81, 19283–19300.
  43. Li, W.; Guo, Y.; Wang, B.; Yang, B. Learning spatiotemporal embedding with gated convolutional recurrent networks for translation initiation site prediction. Pattern Recognit. 2023, 136, 109234.
  44. Subramanian, M.; Shanmugavadivel, K.; Nandhini, P. On fine-tuning deep learning models using transfer learning and hyper-parameters optimization for disease identification in maize leaves. Neural Comput. Appl. 2022, 34, 13951–13968.
  45. Xuan, H.; Liu, J.; Yang, P.; Gu, G.; Cui, D. Emotion Recognition from EEG Using All-Convolution Residual Neural Network. In International Workshop on Human Brain and Artificial Intelligence; Springer: Singapore, 2023; pp. 73–85.
  46. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning, Lille, France, 7–9 July 2015; pp. 448–456.
  47. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141.
  48. Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Sardinia, Italy, 13–15 May 2010; JMLR Workshop and Conference Proceedings. pp. 249–256.
  49. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
  50. Hong, D.; Zhang, Z.; Xu, X. Automatic modulation classification using recurrent neural networks. In Proceedings of the 2017 3rd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 13–16 December 2017; pp. 695–700.
  51. West, N.E.; O’Shea, T. Deep architectures for modulation recognition. In Proceedings of the 2017 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), Baltimore, MD, USA, 6–9 March 2017; pp. 1–6.
  52. Jagannath, J.; Polosky, N.; O’Connor, D.; Theagarajan, L.N.; Sheaffer, B.; Foulke, S.; Varshney, P.K. Artificial neural network based automatic modulation classification over a software defined radio testbed. In Proceedings of the 2018 IEEE International Conference on Communications (ICC), Kansas City, MO, USA, 20–24 May 2018; pp. 1–6.
  53. O’Shea, T.J.; Roy, T.; Clancy, T.C. Over-the-air deep learning based radio signal classification. IEEE J. Sel. Top. Signal Process. 2018, 12, 168–179.
  54. Zhang, Z.; Tu, Y. A Pruning Neural Network for Automatic Modulation Classification. In Proceedings of the 2021 8th International Conference on Dependable Systems and Their Applications (DSA), Yinchuan, China, 5–6 August 2021; pp. 189–194.
  55. Güner, A.; Alçin, Ö.F.; Şengür, A. Automatic digital modulation classification using extreme learning machine with local binary pattern histogram features. Measurement 2019, 145, 214–225.
  56. Rajendran, S.; Meert, W.; Giustiniano, D.; Lenders, V.; Pollin, S. Deep learning models for wireless signal classification with distributed low-cost spectrum sensors. IEEE Trans. Cogn. Commun. Netw. 2018, 4, 433–445.
  57. Dai, H.; Chembo, Y.K. Classification of IQ-modulated signals based on reservoir computing with narrowband optoelectronic oscillators. IEEE J. Quantum Electron. 2021, 57, 5000408.
  58. Wang, Y.; Liu, M.; Yang, J.; Gui, G. Data-driven deep learning for automatic modulation recognition in cognitive radios. IEEE Trans. Veh. Technol. 2019, 68, 4074–4077.
  59. Cyclostationary Signal Processing. Available online: https://www.cyclostationary.blog (accessed on 5 April 2023).
  60. Jiao, J.; Sun, X.; Zhang, Y.; Liu, L.; Shao, J.; Lyu, J.; Fang, L. Modulation recognition of radio signals based on edge computing and convolutional neural network. J. Commun. Inf. Netw. 2021, 6, 280–300.
  61. Xu, J.; Luo, C.; Parr, G.; Luo, Y. A spatiotemporal multi-channel learning framework for automatic modulation recognition. IEEE Wirel. Commun. Lett. 2020, 9, 1629–1632.
  62. Alzaq-Osmanoglu, H.; Alrehaili, J.; Ustundag, B.B. Low-SNR Modulation Recognition based on Deep Learning on Software Defined Radio. In Proceedings of the 2022 5th International Conference on Advanced Communication Technologies and Networking (CommNet), Marrakech, Morocco, 12–14 December 2022; pp. 1–6.
  63. Dong, B.; Liu, Y.; Gui, G.; Fu, X.; Dong, H.; Adebisi, B.; Gacanin, H.; Sari, H. A Lightweight Decentralized-Learning-Based Automatic Modulation Classification Method for Resource-Constrained Edge Devices. IEEE Internet Things J. 2022, 9, 24708–24720.
  64. Wang, N.; Liu, Y.; Ma, L.; Yang, Y.; Wang, H. Multidimensional CNN-LSTM network for automatic modulation classification. Electronics 2021, 10, 1649.
Figure 1. Block diagram of a communication link with automatic modulation classification.
Figure 2. Residual block.
Figure 3. Proposed squeeze block.
Figure 4. Overall block.
Figure 5. t-SNE of the input space vs. the latent feature space.
Figure 6. Classification accuracy of the various network architectures on D1.
Figure 7. Classification accuracy of the various network architectures on D2.
Figure 8. Classification accuracy of the various traditional feature-based methods on D2.
Figure 9. Performance comparison of the different network configurations used in the ablation study.
Figure 10. (a) Prediction performance of the proposed network on D1 (RML2016.10A); (b) prediction performance of the proposed network on D2 (RML2016.04C).
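A projection such as the one in Figure 5 can be reproduced with scikit-learn. The following is a minimal sketch, assuming a trained Keras model handle `model` whose penultimate layer is named "latent", IQ frames `X` of shape (N, 2, 128), and integer labels `y`; the layer name and variable names are illustrative assumptions, not names from the paper.

```python
# Sketch of a t-SNE comparison like Figure 5. Assumptions (not from the
# paper): a trained Keras model `model` with a penultimate layer named
# "latent", IQ frames X of shape (N, 2, 128), and integer labels y.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
from tensorflow import keras

idx = np.random.choice(len(X), 2000, replace=False)  # subsample: t-SNE is slow on full sets
Xs, ys = X[idx], y[idx]

# Re-use the trained network up to its latent layer as a feature extractor.
encoder = keras.Model(model.input, model.get_layer("latent").output)
Z = encoder.predict(Xs[..., None])                   # add the channel axis

emb_in = TSNE(n_components=2).fit_transform(Xs.reshape(len(Xs), -1))
emb_lat = TSNE(n_components=2).fit_transform(Z)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, emb, title in zip(axes, (emb_in, emb_lat), ("Input space", "Latent space")):
    ax.scatter(emb[:, 0], emb[:, 1], c=ys, s=3, cmap="tab20")
    ax.set_title(title)
plt.show()
```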
Table 1. Database representation.
Classes: BPSK, AM-DSB, AM-SSB, 8PSK, QPSK, BFSK, CPFSK, QAM16, QAM64, PAM4, WB-FM
Sample length: 128
SNR range: −20 dB to 18 dB
Databases (number of samples): RadioML 2016.04C (162,060), RadioML 2016.10A (220,000)
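For readers who want to reproduce the arrays summarized in Table 1, the sketch below shows one way to load a RadioML 2016 dataset into NumPy. The file name and the (modulation, SNR)-keyed pickle layout follow the datasets' usual public distribution and are assumptions here, not details stated in the paper.

```python
# Sketch of loading a RadioML 2016 pickle (assumption: the standard
# RML2016.10a_dict.pkl distribution, a dict keyed by (modulation, SNR)
# tuples whose values are arrays of IQ frames with shape (N, 2, 128)).
import pickle
import numpy as np

with open("RML2016.10a_dict.pkl", "rb") as f:
    data = pickle.load(f, encoding="latin1")    # Python-2 pickle -> latin1

mods = sorted({mod for mod, _ in data})         # 11 modulation classes
snrs = sorted({snr for _, snr in data})         # -20 dB ... 18 dB in 2 dB steps

X = np.vstack([data[(mod, snr)] for mod in mods for snr in snrs])
y = np.hstack([
    np.full(len(data[(mod, snr)]), mods.index(mod))
    for mod in mods for snr in snrs
])
print(X.shape, len(mods))  # e.g., (220000, 2, 128) and 11 for RML2016.10A
```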
Table 2. Comparison of the proposed method with contemporary approaches in terms of number of parameters, parameter reduction, and FLOPs at 18 dB SNR.
Architecture      Parameters   Reduction   FLOPs
Inception [51]    23.9 M       –           1502 M
GRU2 [50]         4.8 M        79.9%       0.234 M
ResNet [53]       2.7 M        88.7%       149 M
CLDNN [50]        2.6 M        89.1%       1.65 M
Conv2D [12]       921,611      96.1%       3.15 M
2DCNN [35]        900,000      96.2%       29.6 K
ANN [52]          670,000      97.2%       NA
Depthwise [12]    596,491      97.5%       3.58 M
Separable [12]    385,307      98.3%       18.6 M
Pruned CNN [54]   36.6 M       NA          116.6 M
Proposed          253,274      98.9%       748.88 M
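The "Reduction" column appears to be computed relative to the largest baseline, Inception at 23.9 M parameters; the short sketch below reproduces the tabulated percentages under that reading.

```python
# Sketch of the parameter-reduction figures in Table 2, assuming each
# entry is the saving relative to the Inception baseline (23.9 M params).
baseline = 23.9e6
models = {"GRU2": 4.8e6, "ResNet": 2.7e6, "CLDNN": 2.6e6,
          "Conv2D": 921_611, "Proposed": 253_274}
for name, params in models.items():
    print(f"{name:>8}: {100 * (1 - params / baseline):.1f}% fewer parameters")
# Proposed: 98.9% fewer parameters, matching the table.
```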
Table 3. Network structures of the models included in the ablation study.
Model         Network Structure
RS-1          RS, RS, RS, FC
RS-2          RS, RS, RS, RS, RS, RS, FC, FC
RS-3          RS, RS, RS, RS, RS, FC, FC (kernel size and filters replaced)
RS-4          RS, global average pooling 2D, RS, RS, RS, FC, FC
RS-Proposed   RS, RS, RS, RS, max pooling 2D, FC, FC
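Table 3 describes the configurations only at block level. The sketch below illustrates the RS-Proposed layout (four RS blocks, 2D max pooling, two FC layers) in Keras, using a standard squeeze-and-excitation residual block in the spirit of [37,47]; the filter counts, kernel sizes, and SE reduction ratio are illustrative assumptions, not the paper's exact hyperparameters.

```python
# Minimal Keras sketch of the RS-Proposed layout in Table 3 (RS x 4,
# max pooling 2D, FC, FC). The residual block with squeeze-and-excitation
# follows the standard SE design; all dimensions below are assumptions.
from tensorflow import keras
from tensorflow.keras import layers

def rs_block(x, filters, ratio=8):
    shortcut = layers.Conv2D(filters, 1, padding="same")(x)  # match channels
    y = layers.Conv2D(filters, (2, 3), padding="same", activation="relu")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Conv2D(filters, (2, 3), padding="same")(y)
    y = layers.BatchNormalization()(y)
    # Squeeze-and-excitation: global pooling, then channel re-weighting.
    s = layers.GlobalAveragePooling2D()(y)
    s = layers.Dense(filters // ratio, activation="relu")(s)
    s = layers.Dense(filters, activation="sigmoid")(s)
    y = layers.Multiply()([y, layers.Reshape((1, 1, filters))(s)])
    return layers.ReLU()(layers.Add()([shortcut, y]))

inputs = keras.Input(shape=(2, 128, 1))          # IQ frame as a 2x128 "image"
x = inputs
for f in (32, 32, 64, 64):                       # four RS blocks
    x = rs_block(x, f)
x = layers.MaxPooling2D(pool_size=(1, 2))(x)     # max pooling 2D
x = layers.Flatten()(x)
x = layers.Dense(128, activation="relu")(x)      # FC
outputs = layers.Dense(11, activation="softmax")(x)  # FC over 11 classes
model = keras.Model(inputs, outputs)
model.summary()
```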
Table 4. Parameters, model weight, and accuracy of the models in the ablation study.
Neural Network   Parameters   Model Weight   Accuracy
RS-1             265,929      1.1 MB         92.61%
RS-2             256,354      980 KB         92.12%
RS-3             283,142      1.668 MB       93.40%
RS-4             259,258      1.52 MB        91.55%
RS-Proposed      253,274      920 KB         94.26%
Table 5. Comparison of accuracy attained on both RML datasets (D1 = RML2016.10A, D2 = RML2016.04C; SNR range [−20, +18] dB) with various architecture configurations.
Paper | Dataset | Accuracy per SNR | Classification Method | Architecture
[7] | D2 | 91.91% at 18 dB | Feature-Based | VTCNN2
[12] | D1, D2 | 69.90% at 18 dB (D1); 92.10% at 18 dB (D2) | Feature-Based | Depthwise
[12] | D1, D2 | 71.30% at 18 dB (D1); 83.4% at 18 dB (D2) | Feature-Based | Conv2D
[12] | D1, D2 | 71.25% at 18 dB (D1); 83.03% at 18 dB (D2) | Feature-Based | Separable
[35] | D1, D2 | 73% at 18 dB (D1); 87.4% at 18 dB (D2) | Feature-Based | 2DCNN
[36] | D1 | 80% for SNR > 0 dB | Feature-Based | CM+CNN
[57] | D2 | 88.94% at 18 dB | Feature-Based | RCN
[53] | D1, D2 | 79% at 18 dB (D1); 90% at 18 dB (D2) | Feature-Based | ResNet
[52] | D1 | 71% at 18 dB | Feature-Based | ANN
[51] | D1 | 74% at 18 dB | Feature-Based | Inception
[50] | D2 | 93.20% at 18 dB | Feature-Based | GRU2
[50] | D2 | 93.22% at 18 dB | Feature-Based | CLDNN
[58] | D2 | 90% at 18 dB | Feature-Based | CNN
Proposed | D1, D2 | 81.0% at 18 dB (D1); 94.05% at 18 dB (D2) | Feature-Based | Residual+SE
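Per-SNR accuracies like those in Table 5 (and in Figures 6 and 7) are obtained by bucketing the test set by SNR. A minimal sketch, assuming a trained Keras `model` and test arrays `X_test`, `y_test` with a parallel `snr_test` vector of per-frame SNRs in dB (names are illustrative):

```python
# Sketch of per-SNR accuracy evaluation, as reported in Table 5 and
# Figures 6-7. Assumptions: a trained Keras `model`, test frames X_test
# of shape (N, 2, 128), integer labels y_test, and SNRs snr_test in dB.
import numpy as np

pred = model.predict(X_test[..., None]).argmax(axis=1)  # add channel axis
for snr in sorted(set(snr_test)):
    mask = snr_test == snr
    acc = (pred[mask] == y_test[mask]).mean()
    print(f"SNR {snr:+d} dB: accuracy {100 * acc:.2f}%")
```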
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
