Review

A Survey of Applications of Deep Learning in Radio Signal Modulation Recognition

1 School of Ocean Information Engineering, Jimei University, Xiamen 361021, China
2 School of Informatics, Xiamen University, Xiamen 361005, China
3 School of Electrical Engineering, Hebei University of Technology, Tianjin 300401, China
4 School of Advanced Manufacturing, Fuzhou University, Jinjiang 362251, China
* Authors to whom correspondence should be addressed.
Appl. Sci. 2022, 12(23), 12052; https://doi.org/10.3390/app122312052
Submission received: 30 September 2022 / Revised: 17 November 2022 / Accepted: 21 November 2022 / Published: 25 November 2022
(This article belongs to the Special Issue Applications of Deep Learning and Artificial Intelligence Methods)

Abstract:
With the continuous development of communication technology, the wireless communication environment has become increasingly complex, containing various intentional and unintentional signals. Radio signals are modulated in different ways, and traditional radio modulation recognition technology cannot recognize the modulation modes accurately. Consequently, communication systems have embraced Deep Learning (DL) models, which can recognize modulation modes automatically and with better accuracy. This paper systematically summarizes the research on radio Automatic Modulation Recognition (AMR) based on DL over the last seven years. First, we summarize the current research status of modulation recognition and the necessity of DL-based AMR research. Then, we review current radio AMR methods based on DL. In addition, we propose a network model for AMR based on the Convolutional Neural Network (CNN) and prove its effectiveness. Finally, we highlight existing challenges and research directions of radio AMR based on DL.

1. Introduction

Since the 20th century, with the continuous development of radio communication technology [1], the communication environment has become more and more complex. In order to ensure the accuracy, speed, security and effectiveness of information in the actual communication process, radio modulation recognition is needed; it is an intermediate process between signal detection and signal demodulation. The modulation recognition technology of radio signals plays an important role in the military [2], national security and civil fields. In the crowded electromagnetic spectrum environment, the information sent by the transmitter in wireless communication is affected by various factors, which leads to the mixing of information and noise, so it is very challenging for the receiver to accurately recognize and receive complete information. For example, in military applications, it is necessary to ensure that friendly signals can be sent and received safely and, at the same time, to recognize, interfere with and locate hostile signals [3]. However, the frequency range of signals is now very wide and modulation has evolved from simple narrowband modulation to broadband modulation, so there are more and more modulation types. In this case, it is increasingly difficult to recognize radio modulation modes accurately in real time. In order to improve the efficiency and accuracy of radio modulation recognition, it is imperative to study new approaches for radio signal modulation recognition. With the development of science and technology, a number of AMR technologies have emerged.
AMR can provide the basic modulation information of an input radio signal, especially for non-cooperative radio signals. AMR is an intermediate step between signal modulation and signal demodulation. AMR technology is therefore a prerequisite for demodulating signals at the receiver and a key link in wireless communication. It plays a key role in cognitive radio, spectrum sensing, interference identification, signal monitoring and other scenarios. In the process of signal transmission, on one hand, the signal transmitted by the transmitter is usually affected by noise, multipath fading, center frequency offset, etc.; on the other hand, the signal structure is distorted due to poor hardware design or crystal oscillator drift, which makes it difficult to distinguish different modulation schemes. In such cases, AMR plays a key role. This technology can automatically identify the modulation type of the signal, so as to obtain the information contained in the signal without knowing the system parameters. Its automation can greatly reduce the consumption of human resources while greatly improving the accuracy of signal modulation recognition.
In the early days, traditional modulation recognition mainly depended on manual work. The operator judged the modulation mode of a signal by observing its time domain and frequency domain with an oscilloscope and other instruments. This manual method not only had large recognition errors but also required a long recognition time and could only be applied to a limited set of modulation types. Later, with the development of modulation recognition technologies, two categories of Automatic Modulation Classification (AMC) methods emerged. One is the modulation recognition method based on the maximum likelihood ratio [4,5,6], whose basic idea is hypothesis testing. First, a likelihood probability model is given to estimate the probabilities of different modulation modes. Then, the possible modulation modes are tested. Finally, the modulation type with the maximum likelihood probability is selected as the result. This method makes the classification result optimal under the Bayesian minimum error criterion. However, it requires a lot of computation and a priori information, so it cannot be used widely. The other is the modulation recognition method based on feature extraction [7,8,9,10,11,12,13]. This method extracts the spectral differences between different modulation types and classifies these features by constructing a classifier model, so as to obtain the modulation mode of an unknown signal. This method has low complexity. However, it relies on the selection of signal features. If the selected features are not distinguishable in the communication system, the classification effect will be very poor. Therefore, it is necessary to find an algorithm that has strong generalization ability and can automatically learn modulation features from sample data.
Today, Machine Learning (ML) and DL show overwhelming advantages in fields such as computer vision [14], speech recognition [15], image processing [16] and robotics [17]. For ML, the learning algorithms need to be designed by expert engineers and are mainly aimed at manually engineered features. When ML algorithms fail in the prediction process, expert engineers are required to adjust them. With DL, however, features are learned automatically at multiple levels. DL models can learn and extract features from unlabeled or unstructured data in an unsupervised manner, making decisions without human supervision. Therefore, DL is more automatic and efficient than ML. More recently, DL has also been applied to the field of radio modulation recognition. The process of DL-based radio AMR can be roughly described as follows. First, the DL algorithm is used to encode and learn the radio time domain signals; then, through deep learning, the similarity of the feature vectors or the shared characteristics of similar modulation signals can be matched automatically; finally, the radio features are fine-tuned from top to bottom using the class label information of the training data to obtain a DL representation vector of the radio signal, which is used to train a fully connected classifier for radio modulation classification.
In this paper, we collected the milestone works and the latest progress of DL-based AMC over the past seven years. The papers in the reference section were downloaded from the following sources: Google Scholar, MDPI, IEEE Xplore, Scopus (Elsevier), Springer, Web of Science, ResearchGate, arXiv, etc. This paper mainly reviews DL-based AMC methods. There are many such methods. According to the different DL network models used, they are divided into four categories: CNN-based AMR methods, RNN-based AMR methods, DBN-based AMR methods and hybrid network-based AMR methods. By reviewing the literature of recent years, the latest and novel DL-based AMC methods are summarized. Considering that datasets are necessary for training and testing signals with DL networks, this paper also summarizes the reference papers that propose different datasets.
In recent decades, radio AMR has made significant progress, especially in recent years. DL has brought advancement to many research areas, including radio AMR. The aim of this survey is to comprehensively summarize the relevant work of DL in the field of radio AMR. The main contributions of this paper are summarized as follows:
  • We briefly review the relevant progress of DL-based AMR in the past seven years and point out the benefits of DL technology for AMR research.
  • We summarize the existing methods of DL-based AMR and classify them according to CNN, RNN, DBN and hybrid network. In addition, the new research methods and research trends in the past year are also given.
  • We investigate the radio signal datasets used by DL-based AMR. They are introduced in detail.
  • We propose a CNN-based AMR method, which is proved to have good performance and high recognition accuracy through simulation experiments.
  • We introduce the commonly used evaluation parameters of DL-based AMR for clearer understanding of the relevant literature.
  • We discuss and compare the existing AMC methods based on DL in detail. The existing problems and future research directions are summarized.
This paper is organized as follows. In Section 2, we introduce the related work of AMC based on DL. Next, Section 3 summarizes the existing methods of AMR based on DL. Section 4 summarizes and describes the published radio signal datasets. In Section 5, a radio modulation recognition method based on CNN is proposed and its experimental simulation and verification are carried out. Section 6 introduces the commonly used evaluation parameters. Section 7 discusses the existing DL-based AMR methods. Section 8 summarizes and discusses the problems that need to be solved in the field of radio modulation recognition and the future research directions. Section 9 concludes the whole paper.

2. Related Work

Researchers worldwide have done a lot of work on DL-based radio modulation recognition. They have proposed a variety of network models for DL-based radio AMR. In 2016, Kim [18] first proposed AMR technology based on a Deep Neural Network (DNN). Subsequently, many DL-based modulation recognition methods have been developed. To facilitate the training and testing of networks, O'Shea et al. [19] established a benchmark dataset using GNU Radio. O'Shea et al. [19,20] proposed two methods to recognize the signals of the dataset: one applies a CNN [19] to the field of modulation recognition and the other applies a Recurrent Neural Network (RNN) [20] to modulation recognition. By 2017, some new methods had been developed. Mendis et al. [21] proposed an AMC recognition scheme based on a Deep Belief Network (DBN). Ali et al. [22] first proposed a non-negative constraint training method based on an Auto-Encoder (AE). Hong et al. [23] proposed a two-layer Gated Recurrent Unit (GRU) model, based on the recurrent neural network, with appropriate parameters. In 2018, Rajendran et al. [24] proposed an AMC method with Long Short-Term Memory (LSTM), an improved RNN.
In recent years, more and more methods have been studied in this field and a large number of DL-based methods have been proposed. In 2022, Ghanem et al. [25] proposed a wireless modulation classification algorithm based on CNNs in which the Radon transform (RT) of constellation diagrams with different modulation types is used as input. Soon after, Ghanem et al. [26] conducted a more in-depth study in this direction and proposed an AMC approach based on 2D transforms and CNN, in which various transform methods were used. Abdel-Moneim et al. [27] proposed a new AMC method that combines Gabor filtering, thresholding and CNN. Hamidi-Rad et al. [28] proposed MCformer, a DNN-based transformer for AMC. MCformer makes use of the convolution layer and the self-attention mechanism. Wu et al. [29] proposed a multi-scale feature network with large kernel size and a squeeze-and-excitation mechanism for AMC. Sun et al. [30] proposed two DL-based AMC methods using image classification technology: one uses constellation images and image classification technology and the other uses Graphic Representation of Features (GRF) technology.
From these works, it can be seen that using DL to recognize modulated radio signals has many advantages. The recognition accuracy is greatly improved compared with traditional modulation recognition methods. DL-based recognition methods solve the problems of dependence on manual feature extraction, low robustness and difficult model deployment in traditional radio signal recognition methods. They overcome the shortcomings of traditional linear classification methods, achieve accurate recognition of radio-coded modulation signals and effectively enhance the classification performance of radio modulation recognition.

3. Radio Modulation Recognition Methods Based on Deep Learning

Recently, a variety of DL-based radio AMR methods have been proposed by the academic community. After investigation, it is found that the commonly used DL models are CNN, RNN, LSTM, AE, DBN, etc. In addition, there are some hybrid DL models. Although they are not commonly used, they have great advantages, such as the Deep Multi-scale Convolutional Neural Network (DMCNN) and the Convolutional Long Short-Term Memory Deep Neural Network (CLDNN). We classify the literature according to the DL models used. According to the literature survey, the major papers applying DL models to radio modulation recognition are shown in Table 1 and the applications of different models are described below in detail.

3.1. CNN

CNN is a kind of feedforward neural network with a deep structure and a large amount of computation. It is one of the representative algorithms of DL [75,76]. CNN has local perception, weight sharing and shift invariance. It exploits spatial local correlation by enforcing local connectivity patterns between adjacent layers and sharing weights within each layer. A basic assumption of CNN is that input data are local and shift invariant. Wireless signal sampling data fit this hypothesis. With the continuous expansion of DL knowledge, the structure of CNN models has become more and more diverse. Representative CNN architectures include LeNet [77], AlexNet [78], ZFNet [79], VGGNet [80], GoogLeNet [81], the Residual Network (ResNet) [82] and DenseNet [83]. The general architecture of CNN is shown in Figure 1. A typical CNN consists of an input layer, hidden layers and an output layer. The radio signal x is input into the network through the input layer and data features are extracted and processed by the hidden layers; then the signal modulation mode is output from the output layer and the signals are classified into different modulation modes. The hidden layers include a convolution layer, an activation layer, a pooling layer and a fully connected layer (also called a dense layer). Some models also use other functional layers in between, such as the normalization layer, the dropout layer and so on. The functions of the main layers are described as follows (a minimal code sketch after the list below illustrates this layer stack):
  • Input layer: This layer is used for data entry.
  • Convolution layer: This layer uses the convolution kernel for feature extraction and feature mapping [84].
  • Activation layer: This layer adds nonlinear mapping by using activation functions, because linear models are not expressive enough.
  • Pooling layer: This layer carries out a subsampling operation on the feature maps output by the convolution layer, so as to reduce the number of parameters.
  • Fully connected layer: This layer converts the previous activation maps into a probability distribution and finally sends it to the Softmax layer for classification.
  • Output layer: This layer is used to output classification results.
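As a minimal illustration of the layer stack just described, the following Keras sketch (in the same Keras/TensorFlow setting used later in Section 5) stacks convolution, activation, pooling and dense layers to map a 2 × 128 IQ sample to one of 11 modulation classes. The filter counts, kernel sizes and class count are illustrative assumptions, not the architecture of any specific paper surveyed in this section.

```python
# Minimal CNN sketch for IQ-sample modulation classification (illustrative assumptions only).
from tensorflow.keras import layers, models

def build_generic_cnn(num_classes=11):
    return models.Sequential([
        layers.Input(shape=(2, 128, 1)),                                # input layer: 2 IQ rows x 128 time steps
        layers.Conv2D(64, (1, 3), padding="same", activation="relu"),   # convolution + activation
        layers.MaxPooling2D(pool_size=(1, 2)),                          # pooling: subsample along time
        layers.Conv2D(32, (2, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(1, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),                           # fully connected layer
        layers.Dense(num_classes, activation="softmax"),                # output layer: class probabilities
    ])

model = build_generic_cnn()
model.summary()
```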
Radio modulation recognition is no exception among the areas in which CNN is used. There is a lot of literature on CNN-based radio modulation recognition methods, as summarized in Table 2 and described below. Zhang et al. [31] used CNN to recognize and classify radio waveforms and used a two-dimensional time-frequency diagram to characterize various signals. When the signal-to-noise ratio (SNR) is −2 dB, the overall ratio of successful recognition (RSR) can reach 93.7%. Sethi et al. [32] proposed a signal distortion correction module (CM). They used CM to shift the signal frequency and phase before modulation recognition. The experimental results show that the recognition accuracy of CM combined with CNN is significantly higher than that of CNN and CLDNN. There are also many modulation recognition methods based on radar signals. Gao et al. [33] proposed an AMR network for radar signals based on a transfer learning CNN. The effective information on the fused image is extracted and identified. When the SNR is −6 dB, the overall RSR can reach 95.5%. Wang et al. [34] combined two CNNs trained on different datasets (i.e., a DL-based combination of two CNNs) to achieve more accurate radio AMR and designed a constellation-based CNN to identify modulation modes that were difficult to distinguish in previous CNNs. Xu et al. [35] proposed a CNN-based radio automatic modulation recognition method, which used the short-time Fourier transform (STFT) to create spectrogram images of different complex signals and thereby convert complex modulation recognition problems into image recognition problems. Rakesh et al. [36] proposed a radio access technology (RAT) recognition approach based on the combination of time-frequency distributions and CNN. Time-frequency analysis was used to obtain the spectral content of the signal and CNN was used for feature extraction and recognition. Performance charts and confusion matrices for correct recognition were used to analyze the accuracy of the network. Wu et al. [37] proposed a CNN-based radio AMC method with multifeature fusion, which can achieve the same or better results with fewer learning parameters and less training time. Gu et al. [38] proposed a BCI-assisted generalized AMR method based on DL with two CNNs: the former identifies the channel category of the signal, while the latter classifies the signals under the same channel. The simulation shows that the proposed GenAMR method is significantly better than the traditional one. Yang et al. [39] studied three fusion methods for the case where the signal length is longer than the designed CNN input length: voting-based fusion, trust-based fusion and feature-based fusion. Experiments show that the latter two methods have better performance. Yongshi et al. [40] proposed a radio AMC method based on wavelet denoising pre-processing and an improved CNN architecture. The method first decomposes the baseband signal into various frequency scales by wavelet transform, then classifies the pre-processed signal with the improved CNN. Dileep et al. [41] proposed an AMC method based on a dense-layer dropout CNN (DDrCNN), selected categorical cross-entropy as the loss function and Adam as the optimizer, and used only one CNN. The modulation schemes are classified from the IQ samples of the training data. Accuracy above 97% can be achieved at SNRs above 2 dB. Li et al. [42] proposed a sparse filtering criterion to carry out unsupervised layer-by-layer pre-training of the CNN, which effectively improved the generalization ability. Peng et al. [43] used AlexNet and GoogLeNet, two CNN-based DL models, to classify the modulated signals; that is, several methods were developed to represent the modulated signals in a grid-topology data format suitable for CNN. Wu et al. [44] constructed a five-layer CNN model to identify VHF signals. Simulated and actual signals showed that frequency shift and noise had a great influence on accuracy. Peng et al. [45] proposed the idea of using CNN to classify the modulation types in a communication system. In their method, the constellation diagram was used to represent the modulated signal for the CNN and the AlexNet model was used for training and testing. Kulin et al. [46] used time-domain features, such as the IQ vector and the amplitude/phase vector, to train the CNN classifier. Experimental results showed that the scheme could identify ZigBee, WiFi and Bluetooth signals well. O'Shea et al. [47] extended the deep CNN model for radio and used a deep residual network for signal classification; its robustness was also discussed. Longi et al. [48] proposed a supervised model based on CNN and trained it on a series of pseudo-labeled time-slice spectral data. Zhang et al. [49] proposed an AMR framework based on CNN. A preprocessed signal representation was proposed, which combined the orthogonal and fourth-order statistics of the modulated signals. The accuracy was improved by 8%. Sang et al. [50] proposed an AMR method for radio signals based on an improved CNN. The classification accuracy can reach 93%. Wang et al. [51] proposed a lightweight CNN for AMC. Different model blocks in the network were used for feature extraction, feature reconstruction and fully connected classification. The results showed that this method greatly reduced the number of parameters and the inference time. Zhang et al. [52] proposed a multiscale CNN for constellation-based modulation classification. The network structure was composed of multiple processing modules to fully learn internal features from constellation-like images. At the same time, the convolution gray image was developed and the convolution kernel was used to overcome the shortcomings of existing imaging schemes. The average classification accuracy of the network trained on the convolution gray image dataset was about 97.7% at 4 dB SNR. Ghanem et al. [25] proposed a CNN-based AMC method, which used the RT of constellation diagrams as input. For several modulation types, the Radon transform of the constellation diagrams improved the performance and accuracy of the classifier. Du et al. [53] proposed a dilated CNN for AMR. Firstly, the one-dimensional modulation signal was converted into a two-dimensional asynchronous delay histogram. Then, it was input into a CNN based on dilated convolution kernels. Experimental results showed that this AMR method significantly improved the recognition accuracy at low SNR. Shi et al. [54] proposed an AMR method that includes a multi-scale convolutional deep network with an attention module and a shallow network for recognizing modulation types that were easy to misclassify. The overall recognition accuracy could reach 98.7% and the method performed well in recognizing high-order and analog signals. Le et al. [56] evaluated five CNN models, including ResNet18, SqueezeNet, GoogLeNet, MobileNet and RepVGG. The experimental results showed that the SqueezeNet model achieved the highest accuracy of 97.5% when the SNR was +8 dB. Based on the evaluation results of the single models, an ensemble learning method was proposed. Experimental results showed that ensemble learning improved the accuracy of modulation recognition and that the weighted ensemble had better performance than the unweighted model.

3.2. RNN

RNN was first proposed by Pollack in 1990 [85]. It takes sequence data as input and recurses along the evolution direction of the sequence, with all nodes connected in a chain. This distinguishes it from feedforward machine learning algorithms, in which data flow in only one direction, from input to output.
As shown in Figure 2, a recurrent neural network is composed of an input layer, a recurrent layer, a fully connected layer and an output layer. X(t) refers to the input vector, O(t) refers to the output vector, H(t) refers to the hidden state passed from the recurrent layer to the fully connected layer at the current time and H(t−1) refers to the hidden state at the previous time. The recurrent layer is used for feature extraction and the fully connected layer is used for feature classification. RNN is a network with memory. Because it contains an internal memory-like state, the current state depends not only on the input at the current moment but also on the input at the previous moment. This is realized by the recurrent unit in RNN neurons. Expanding the neurons of the RNN recurrent layer along the time sequence gives the structure shown in Figure 3, where the grey box is the recurrent unit, X(t) represents the input vector at a certain time and H(t) indicates the state of the hidden layer at the current time.
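In standard notation (ours, not taken from the cited figures), the computation performed by the recurrent layer and the output layer can be written as

$$H(t) = f\big(W_{x}X(t) + W_{h}H(t-1) + b_{h}\big), \qquad O(t) = g\big(W_{o}H(t) + b_{o}\big),$$

where $W_{x}$, $W_{h}$ and $W_{o}$ are weight matrices, $b_{h}$ and $b_{o}$ are bias vectors, $f$ is typically a tanh or ReLU activation and $g$ is typically Softmax for classification.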
Although RNN is theoretically unrestricted in the length of the time series it can process, it has a long-term dependence problem; that is, when learning a long sequence, gradient vanishing and gradient explosion [87] occur in the recurrent neural network, so it cannot capture long-span non-linear relationships. To alleviate the long-term dependence problem, Hochreiter proposed the LSTM in 1997, which improves the cell structure [88].
As shown in Figure 4, X(t) represents the input of the LSTM unit and H(t) represents the output of the LSTM unit. The RNN is endowed with the ability to control its internal information accumulation by adding gating units [89], so as to control the influence of the input at the current time on the network. More specifically, the LSTM unit contains three gates: the input gate, the forget gate and the output gate. The input gate allows input signals to adjust the memory cell state or blocks them, for example by setting the input gate to zero; the output gate allows the cell state to affect other neurons or prevents it from doing so; and the forget gate enables the memory cell to remember or forget its previous state, with only a small amount of linear interaction in the information flow. This makes it easier to remember long-term information and allows the model to converge more easily. LSTM can solve the problem of gradient vanishing on the one hand and remember past data on the other hand. LSTM is trained using backpropagation, so time series with time lags of unknown duration can also be classified.
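For reference, the standard LSTM update equations corresponding to the three gates described above (written in generic notation, not taken from Figure 4) are

$$\begin{aligned} i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)}\\ f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)}\\ o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\ c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(cell state)}\\ h_t &= o_t \odot \tanh(c_t) && \text{(hidden output)} \end{aligned}$$

where $\sigma$ is the sigmoid function and $\odot$ denotes element-wise multiplication.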
There are also many studies in the literature on RNN-based radio modulation recognition methods, as summarized in Table 3 and described below. Hong et al. [23] proposed a two-layer GRU model with appropriate parameters based on the RNN model. Using the time series characteristics of the signal, the original signal can be used directly with a limited data length, avoiding manual feature extraction. Compared with [90], the double-layer GRU model has obvious advantages at high SNR. O'Shea et al. [57] analyzed the effect of CNN layer size and depth on classification accuracy and proposed a complex prior module that combines CNN and LSTM modules to improve the classification accuracy of radio AMR. Rajendran et al. [24] proposed a classification method for radio AMR based on LSTM, learning from the amplitude and phase information of the modulation scheme in the time domain present in the training data, without the need for expert features such as high-order cyclic moments. Zhang et al. [49] proposed an AMR model based on LSTM. A preprocessed signal representation was proposed, which combined the orthogonal and fourth-order statistics of the modulated signal. The accuracy was improved by 8%. Sang et al. [50] proposed an AMR model based on an improved LSTM. The model achieved an accuracy of 76% over all SNRs. Daldal et al. [58] proposed automatic recognition of digital modulation based on a deep LSTM model. This method did not need any feature extraction and directly input the modulated signal into the system. The classification accuracy reached 94.72%.

3.3. DBN

DBN is another typical DL algorithm, proposed by Hinton in 2006 [91]. It is a probabilistic generative model and is widely used in natural language processing [92,93,94,95] and image recognition [95,96,97,98,99,100].
Figure 5 shows the classic DBN network structure, which is composed of several restricted Boltzmann machine (RBM) layers. Each RBM is "restricted" to two layers, an upper hidden layer and a lower visible layer (i.e., the input layer), with connections only between the two layers. The DBN training process is divided into two steps. The first step is pre-training, carried out layer by layer: the output of the lower layer serves as the input to the upper layer. The second step is fine-tuning: the last layer is trained in a supervised manner and the error obtained by comparing the output with the labels is propagated backward layer by layer, fine-tuning the overall weights.
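For reference, each RBM defines a joint distribution over its visible units $v$ and hidden units $h$ through the standard energy function (generic notation, not taken from the survey):

$$E(v,h) = -\sum_{i} a_i v_i - \sum_{j} b_j h_j - \sum_{i,j} v_i W_{ij} h_j, \qquad P(v,h) = \frac{e^{-E(v,h)}}{Z},$$

where $W$ is the weight matrix between the visible and hidden layers, $a$ and $b$ are bias vectors and $Z$ is the normalization constant. Layer-wise pre-training fits $W$, $a$ and $b$ of each RBM (typically with contrastive divergence) before the supervised fine-tuning step.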
There is not much literature on DBN-based radio modulation recognition methods. The existing works are summarized in Table 4 and described below. Zhang et al. [59] used a deep belief network and an unsupervised greedy algorithm to pre-train the RBMs. The result obtained was used as the initial value for training the supervised probability model and the time-domain IQ data representation was used to identify the modulation types, which improved the recognition rate. In the cognitive radio signal modulation pattern recognition algorithms based on DBN proposed by Wei et al. [60] and Mendis et al. [21], the spectral correlation function was used as the feature representation of the received signal, even in the presence of environmental noise. Cui et al. [102] proposed a DBN-based DL algorithm which was applied to primary user classification and significantly reduced the number of labeled data required. At the same time, it had a better recognition rate than CR engines using traditional strategies such as shallow learning, with a detection accuracy of more than 90% and a classification accuracy of more than 85%. Sun et al. [61] proposed a cooperative Bayesian compressive spectrum detection method based on RBM, which used a Bayesian compressive sensing model to detect wideband sparse signals and then used RBM learning to implement a fusion decision based on multiuser recovered signals.

3.4. Other Models Based on DL Networks

In addition to the above several common DL network models, there are some not so commonly used DL network models for radio signal modulation recognition. They are as follows: AE, CLDNN, dynamic multi-pooling convolutional neural network (DMCNN), Gated Recurrent Unit (GRU), Adversarial Training for Supervised and Semi-Supervised Learning and so on.
The relevant literature is summarized in Table 5 and described in the following. Xie et al. [68] proposed a hybrid network model whose structure consists of DenseNet, BLSTM and DNN. They tested and proved the effectiveness of the proposed algorithm using the RadioML2016.10a dataset. Experimental results showed that the classification accuracy of this network model was much higher than that of the benchmark model at high SNR and that it could extract deeper information. Li et al. [65] proposed a signal classifier based on a generative adversarial network, added an encoder network and a signal space transformation module and achieved a significant accuracy improvement. Nie et al. [66] proposed a new deep hierarchical network (DHN) based on CNN, which combined shallow features with advanced features and used the SNR as a weight in training. Liu et al. [67] combined CNN and the long short-term memory architecture into a deep neural network (CLDNN), which improved the accuracy by about 13.5% compared with the original CNN model. Ali et al. [22] proposed an unsorted input data classifier (UDNN) based on a k-sparse autoencoder, which used the power of DNN to learn high-level abstractions from raw input data in order to omit K log(K) comparison operations as much as possible. Hao et al. [69] proposed an AMR method based on a CNN-GRU hybrid network. This method used different structures to automatically extract and classify features of different dimensions. The overall recognition accuracy on RadioML2016.04c and RadioML2016.10a was 60.64% and 73.2%, respectively. Njoku et al. [71] proposed an AMC method based on a hybrid neural network composed of a shallow CNN, a GRU and a DNN. The recognition accuracy reached 93.5% and 90.38% on RadioML2016.10a and RadioML2016.10b, respectively. Wang et al. [72] proposed an AMC method of hierarchical multifeature fusion based on multidimensional CNN and LSTM. The multidimensional CNN complemented the interactive features extracted by the two-dimensional convolution filters with the features extracted by the one-dimensional filters. The LSTM layer was used to extract the temporal characteristics of the signal. The recognition accuracy was higher than that of other methods. Wang et al. [74] proposed a novel multi-cue fusion network for AMR. The network consists of a signal cue multi-stream (SCMS) module and a visual cue discrimination (VCD) module. The SCMS module, based on CNN and the Independently Recurrent Neural Network (IndRNN), was used to extract two signal cues (in-phase/quadrature and amplitude-phase), aiming to explore various differences and make use of the complementarity of multiple data forms. The VCD module took the constellation map as the visual cue and used CNN to extract the structural information of the map. The recognition accuracy reached 97.8% and 96.1% on RadioML2016.10a and RadioML2018.01a, respectively. Liu et al. [70] proposed a modulation recognition method that combined a feature-extraction-based GRU with a cyclic-spectrum-based CNN. The results showed that this method greatly improved the modulation recognition rate at low SNR: the recognition rate was more than 90% when the SNR was −6 dB and 100% when the SNR was −1 dB. Aiming at the problem of low accuracy of wireless signal modulation recognition, Lei et al. [73] proposed a rough and fine feature fusion network. Combining the rough and fine feature fusion module with LSTM achieved better recognition accuracy.

3.5. Recent Research Trends of AMC Based on DL

By reading through the literature, we found that most of the previous studies had focused on the design of network models. Most network models were based on the extension or integration of basic models such as CNN, RNN, LSTM and DBN. In order to further improve the performance of DL-based AMC, there are some new research trends. These can be roughly divided into the following three categories:
The first category considers the transformation of the input signals. Some studies focus not only on the design of the network model but also on the form of the signal input to the network. Instead of using the original IQ signal as the network input, the IQ signal is transformed in different ways and the transformed form is used as the input so that the network can better extract the characteristics of the signal. This improves the performance of AMC. In 2022, Ghanem et al. [25] orthogonally projected the original IQ signal to obtain the constellation diagrams of the signal and proposed an AMC algorithm based on CNNs. The RT of constellation diagrams with different modulation types is used as the input of the network. The experimental results showed that the RT of constellation diagrams could improve the performance of the classifier and the recognition accuracy at low SNRs. Soon after, Ghanem et al. [26] conducted a more in-depth study in this direction. An AMC approach based on 2D transforms and CNN was proposed. The constellation diagrams were processed using three different 2D transforms: the RT, the curvelet transform and phase congruency (PC). The effect of using the differently transformed constellation diagrams on the performance of AMC was analyzed experimentally. Quan et al. [103] proposed an LPI radar signal recognition method based on a dual-channel CNN and feature fusion. The authors used the wavelet transform to convert the signal into a time-frequency image, converted the time-frequency image to grayscale and then input it into the dual-channel CNN model. This model could extract two features from the signal time-frequency diagram, namely, the directional gradient histogram feature and the deep feature. Finally, the two features were combined for classification. The recognition rate could reach more than 95% when the SNR was 6 dB.
The second category is about feature extraction. Recently, many studies have added a transformer or an attention mechanism to the basic network. The transformer can extract the temporal correlation features between signals and improve the recognition accuracy of neural networks. The attention mechanism can make the model focus on the relevant characteristics of signals and accelerate the training process. It is able to judge the importance of each feature, select important signal features for processing and improve the efficiency of the neural network. In 2022, Hamidi-Rad et al. [28] proposed a new transformer-based DNN, MCformer, for AMC of complex radio signals. MCformer used the convolution layer and the self-attention mechanism. It significantly reduced the number of parameters and achieved state-of-the-art performance. Experiments showed the excellent performance of the MCformer-based architecture. Lin et al. [55] proposed a time-frequency attention mechanism for CNN-based automatic modulation recognition. The time-frequency attention mechanism was designed to learn which channel, frequency and time information is more meaningful for modulation recognition in CNN. Experimental results demonstrated that the proposed attention mechanism required an inference time similar to the other methods and fewer learned parameters than IQ-CNN and CLDNN.
The third category concerns hardware implementation. Although most previous studies on AMC have achieved good results, most of them are still at the simulation stage. Recently, many scholars have begun studying DL-based AMC hardware implementations. In 2022, Kumar et al. [104] designed CNN-based AMC schemes for the complex-time radio signal domain and implemented them on an FPGA platform. Based on an iterative pruning training mechanism, the model size on the hardware was reduced while the overall accuracy was kept above a certain threshold. The proposed scheme achieved an accuracy at least 1.4% higher than the baseline while using only 40% of the hardware resources. The model achieved a real-time throughput of 527k classifications per second, with a latency of 7.5 μs.

4. Datasets

When using the DL network model to train and test radio modulation signals, datasets are essential. After training and testing the model with the dataset, we can know the recognition effect of our model on the data and whether the model can recognize the modulation of radio signals correctly. Therefore, we will introduce the datasets currently used in the field of radio modulation recognition based on DL. They can be roughly divided into two categories: public open source synthetic datasets and other datasets. The public open source synthetic datasets are a type of dataset that can be publicly used. This paper mainly refers to the RF datasets for machine learning created by O’Shea et al., including RadioML2016.10a [105], RadioML2016.10b [105], RadioML2016.04c [19] and RadioML2018.01a [47]. HisarMod2019.1 was created with MATLAB by Tekbıyık et al. [106]. Other datasets refer to those created by some simulation software according to the experimental requirements. Next, we will introduce these datasets through two parts: dataset parameters and generation methods.

4.1. RadioML2016.04c

This is a dataset composed of 11 modulation modes (8 digital and 3 analog), which include BPSK, QPSK, 8PSK, 16QAM, 64QAM, BFSK, CPFSK and PAM4 for digital modulation and WB-FM, AM-SSB and AM-DSB for analog modulation. The rate of the modulated data is about eight samples per symbol. The normalized average transmit power is 0 dB. There are 220,000 samples in each modulation mode. Each signal is sampled into 2 × 128 vector data. The label includes the SNR value and the modulation type. The SNR of each signal sample ranges from −20 dB to 18 dB. The display parameters of the dataset are described in Table 6.
The data generation method is described below. The synthesis of the radio communication signal introduces modulation, data carrying, pulse shaping and other transmission effects consistent with the real world. Real speech and text datasets are modulated onto communication signals. At the same time, in order to ensure that the bits are equiprobable, a block randomizer is used when the signal is digitally modulated. In addition, a robust model is used for the multipath fading of the impulse response in the time-varying channel, the random walk drift of the carrier oscillator and sampling clock and additive white Gaussian noise. The synthetic signal set is passed through a rigorous channel model in which unknown scale, translation, dilation and impulse noise are introduced, making the channel model more realistic. The generation of the dataset is modeled in GNU Radio [107] using the GNU Radio channel model [108]. Each time series signal is divided into training and test sets using a 128-sample rectangular windowing process. The total dataset is stored as a Python pickle file. It contains 32-bit floating-point samples and is about 500 Mbytes.
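As a practical note, a RadioML2016-style pickle file can be read with standard Python tools. The sketch below assumes the commonly used key layout of these releases, a dictionary mapping (modulation, SNR) tuples to arrays of shape (N, 2, 128); the file name is a placeholder.

```python
# Sketch: loading a RadioML2016-style pickle dataset (assumed layout: {(modulation, snr): array of shape (N, 2, 128)}).
import pickle
import numpy as np

with open("RML2016.04c.dat", "rb") as f:           # placeholder file name
    data = pickle.load(f, encoding="latin1")        # dict keyed by (modulation, SNR)

mods = sorted({mod for mod, snr in data.keys()})
snrs = sorted({snr for mod, snr in data.keys()})

X, y = [], []
for (mod, snr), samples in data.items():
    X.append(samples)                               # samples: (N, 2, 128) IQ vectors
    y.extend([mods.index(mod)] * samples.shape[0])
X = np.vstack(X)
y = np.array(y)
print(X.shape, len(mods), "modulation classes, SNR from", snrs[0], "to", snrs[-1])
```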

4.2. RadioML2016.10a

The signal data in the dataset include 11 modulation modes, namely, 8PSK, AM-DSB, AM-SSB, BPSK, CPFSK, GFSK, PAM4, QAM16, QAM64, QPSK and WBFM, of which 8 are digital modulation modes and 3 are analog modulation modes. It contains 220,000 sampled communication signal samples, with 20,000 samples for each modulation mode. Each signal is sampled as vector data of size 2 × 128, where 2 represents the I and Q components and 128 represents 128 time nodes. The SNR ranges from −20 dB to 18 dB and is recorded in the sample labels. The samples-per-symbol parameter is a modulation characteristic that specifies the number of samples representing each modulation symbol. The display parameters of the dataset are described in Table 7.
This dataset is generated using GNU Radio. The specific data generation method is divided into the following steps. The first step is to select the source alphabet. The analog modulations use a publicly available copy of Serial Episode 1. The digital modulations use the ASCII text of Shakespeare's complete works from Project Gutenberg. In order to equalize the symbols and bits of the data, a whitening randomizer is used for digital modulation. The second step is modulation. In order to obtain a normalized symbol rate across all digital modulations, each symbol value has a normalized number of samples. Different modulation modes have different usages and transmission modes. The third step is channel simulation. The production of this set of radio signals mainly uses the blocks of GNU Radio's dynamic channel model. The fourth step is data storage. The data are standardized first: each stored signal sample is scaled based on the unit energy in each 128-sample data vector. Then, using numpy and cPickle, the data are stored as an n-dimensional vector and the time periods sampled from the analog output stream are stored in the output vector. The last step is to classify the signals by machine learning.

4.3. RadioML2016.10b

This dataset contains 10 modulation modes: 8 digital modulations and 2 analog modulations. These include 8PSK, BPSK, CPFSK, GFSK, PAM4, QAM16, QAM64 and QPSK for digital modulation and WBFM and AM-DSB for analog modulation. It contains 1,200,000 sampled communication signal samples. The parameters of RadioML2016.10a and RadioML2016.10b are the same except for the difference in modulation types and signal quantity. The display parameters of the dataset are described in Table 8. In addition, the two datasets are generated in the same way.

4.4. RadioML2018.01A

This dataset contains 24 modulation modes: 32PSK, 16APSK, 32QAM, FM, GMSK, 32APSK, OQPSK, 8ASK, BPSK, 8PSK, AM-SSB-SC, 4ASK, 16PSK, 64APSK, 128QAM, 128APSK, AM-DSB-SC, AM-SSB-WC, 64QAM, QPSK, 256QAM, AM-DSB-WC, OOK and 16QAM. Each modulation mode contains 26 SNRs, with 4096 frames at each SNR; each frame contains two IQ components with 1024 points per component, so the dataset contains (24 × 26 × 4096) × 1024 × 2 = 5,234,491,392 values. The dataset is stored as three datasets (X, Y, Z) in an HDF5 file. The X dataset is a three-dimensional array of shape (2,555,904, 1024, 2) that stores the signal modulation data (that is, the data used to recognize signal types), representing a total of 2,555,904 signal frames, each with 1024 samples containing two IQ components. The Y dataset is a two-dimensional array of shape (2,555,904, 24) that stores the signal type of each of the 2,555,904 frames, represented by 24 digits. The Z dataset is a one-dimensional array of shape (2,555,904, 1) that stores the SNR of each of the 2,555,904 frames. The SNR of each signal sample ranges from −20 dB to 18 dB. The parameter description of the dataset is shown in Table 9.
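The HDF5 layout described above can be read with h5py, as in the sketch below; the file name is a placeholder and the one-hot interpretation of the 24-digit Y labels is an assumption.

```python
# Sketch: reading a RadioML2018.01A-style HDF5 file with datasets X, Y, Z (file name is a placeholder).
import h5py
import numpy as np

with h5py.File("RML2018.01A.hdf5", "r") as f:
    X = f["X"][:1000]            # (N, 1024, 2) IQ frames; slice to keep memory use small
    Y = f["Y"][:1000]            # (N, 24) modulation-type labels
    Z = f["Z"][:1000]            # (N, 1) SNR of each frame

labels = np.argmax(Y, axis=1)    # assuming the 24 digits form a one-hot indicator
print(X.shape, labels[:5], Z.min(), Z.max())
```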
This dataset is an improved version of the previous datasets. Unlike the three datasets mentioned earlier, this dataset uses 24 different modulation types, divided into analog and digital; single-carrier modulation schemes are also included. Firstly, a model is built to generate several simulated wireless channels and the dataset signals. Then an over-the-air (OTA) test is carried out on a clean signal channel without synthetic impairments. The test is carried out by first modulating the signal and then transmitting it. The signal is set to an offset of about 1 MHz and stored at baseband. The received signals are stored together with the real-world modulation labels of the transmitter. The next part classifies the signals. The specific classification methods include the baseline method, CNN and ResNet [47]. After the signals are classified, appropriate data can be obtained.

4.5. HisarMod2019.1

This dataset contains 26 different modulation types belonging to five modulation groups and affected by five different wireless channel types. Each modulation type comprises 1500 signals with a length of 1024 I/Q samples, so the total number of samples is 780,000. The SNR of each signal sample ranges from −20 dB to 18 dB. When generating the signals, the oversampling rate is set to 2 and a raised cosine pulse shaping filter with a roll-off factor of 0.35 is used. The display parameters of the dataset are described in Table 10.
MATLAB 2017a is used to create the random bit sequences, symbols and wireless fading channels. The dataset consists of signals passing through five different wireless communication channels: ideal, static, Rayleigh, Rician (k = 3) and Nakagami–m (m = 2). These channels are distributed evenly over the dataset; therefore, for each channel type, there are 300 signals for each modulation type at each SNR level. An ideal channel is one in which there is no fading but additive white Gaussian noise (AWGN). In a static channel, the channel coefficients are randomly determined at the beginning and remain constant during the propagation time. The signals passing through the Rayleigh channel are used to make the system robust to the non-line-of-sight (NLOS) condition. Since the dataset also covers mild fading, Rician fading with shape parameter k of 3 is used. In addition to these channel models, for the rest of the signals in the dataset, the distribution of the received power is selected as Nakagami–m with shape parameter m of 2. Therefore, the dataset includes signals with different fading models. Note that the number of multipath channel taps may be 4 or 6 and ITU–R M1225 [109] is adopted for these two tap configurations.

4.6. Other Datasets

In addition to the publicly available datasets mentioned above, many works use self-created datasets generated by simulation software according to their needs. For example, Dileep et al. [41] used MATLAB to generate the required dataset and Zhang et al. [59] generated datasets through GNU Radio simulation. The parameters of different datasets differ; the parameters and formats of the datasets are simulated and set according to the different requirements for the data.

5. Radio Modulation Recognition Model Based on CNN

5.1. Modulation Recognition of Radio Signals by CNN

As one of the representative methods of DL, CNN has excellent performance in many fields. We mentioned some CNN-based radio modulation recognition methods in Section 2, and they can produce good results. However, the accuracy of modulation recognition experiments using the datasets introduced in the previous section is still not very high. Therefore, we have carried out relevant experiments in this regard and achieved a satisfactory result.
The general framework of the CNN-based radio signal AMR method is shown in Figure 6. The specific steps are as follows. Firstly, the dataset is divided into a training set and a test set in the ratio of 1:1; this method uses raw IQ data. Secondly, the training set data are preprocessed and input into the neural network. Thirdly, the neural network is trained iteratively on the training set many times and the network parameters are constantly updated, so that the modulation recognition effect of the network is the best. Then, the trained neural network is used on the test set data. Finally, the modulation recognition results predicted by the neural network are output.
The experiment we carried out is radio signal modulation recognition based on CNN. The dataset used is the open source RadioML2016.04c dataset described in Section 4. During training, the dataset is divided into a 50% training set and a 50% test set. The Keras framework is used in the experiment, with TensorFlow as the back end, and training is accelerated by a GPU. Batch_size is set to 1024 by default during training and dropout is 0.5. Training uses the categorical_crossentropy loss function and the Adam optimizer.
We designed a simple CNN network framework to verify the effectiveness of CNN for AMC. Many existing references are mainly based on this simple framework and add some additional modules to improve performance. The implementation process of this CNN-based AMC is shown in Figure 7. Firstly, the signal is input into the network and four convolution layers are used to process the signal and automatically learn its characteristics. At the same time, in order to prevent overfitting, we use dropout after each convolution layer. Finally, the signal modulation recognition result is output through the fully connected layer and the Softmax activation function.
The CNN network structure adopted in the experiment is shown in Figure 8 and Table 11. There are six layers in the network, including four convolution layers and two dense layers. The input is reshaped from [N, 2, 128] to [N, 1, 2, 128], where N is the number of samples in a batch (batch_size). At each step, one matrix of size 2 × 128 enters the network. The input layer is followed by four convolution layers, all of which use ReLU as the activation function. Each of these four layers is preceded by zero padding that pads the width symmetrically; this 2D zero padding is set to (0, 2). Channels_first is used for ordering the dimensions in the inputs (batch_size, channels, height, width). The first, second and third Conv/ReLU layers have 256 output filters and a 1 × 3 filter size. The fourth Conv/ReLU layer has 80 output filters and a 2 × 3 filter size. Each of these four layers is followed by dropout for regularization. The four convolution layers are followed by two dense (fully connected) layers. The first dense layer, after the input is flattened, obtains its output by taking the dot product between the input tensor and a kernel matrix with 256 output units, followed by the ReLU activation function. The second dense layer obtains its output by taking the dot product between the input tensor and a kernel matrix with 11 output units, followed by Softmax as the activation function. Since at this stage we want to classify the types of signals to be recognized, the output here has 11 classes, corresponding to the 11 types of signals in the dataset. After running, we obtain the results described below.
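The following Keras sketch reconstructs the network structure described above from the text; the exact placement of the dropout and padding layers, and the dropout rate of 0.5 taken from the training settings in Section 5.1, are assumptions rather than an exact listing of the implementation.

```python
# Sketch of the CNN described above, reconstructed from the text (dropout placement and rate are assumptions).
from tensorflow.keras import layers, models

def build_amr_cnn(num_classes=11, dropout_rate=0.5):
    model = models.Sequential()
    model.add(layers.Reshape((1, 2, 128), input_shape=(2, 128)))          # [N, 2, 128] -> [N, 1, 2, 128]
    for filters, kernel in [(256, (1, 3)), (256, (1, 3)), (256, (1, 3)), (80, (2, 3))]:
        model.add(layers.ZeroPadding2D(padding=(0, 2), data_format="channels_first"))
        model.add(layers.Conv2D(filters, kernel, activation="relu", data_format="channels_first"))
        model.add(layers.Dropout(dropout_rate))
    model.add(layers.Flatten())
    model.add(layers.Dense(256, activation="relu"))
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model

model = build_amr_cnn()
model.summary()
```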
In order to verify the effectiveness of the proposed model, existing recognition methods such as CNN, ResNet, Inception and CLDNN are compared with the method proposed in this section. The parameters of the existing recognition methods are kept at their original settings. After training on the RadioML2016.04c dataset, the experimental results are shown in Figure 9. This figure shows the recognition accuracy of each method over the SNR range [−20 dB, +18 dB], in which the blue line represents the recognition accuracy of the method in this section. It can be seen that the recognition accuracy of the proposed model in the range [−20 dB, −10 dB] is not much different from those of the existing methods, but its recognition accuracy in the range [−10 dB, 18 dB] is better than those of the other methods. The highest recognition accuracy is 98.47% at +18 dB. This verifies that our method is effective for the wireless signal modulation recognition task.
Table 12 shows the comparison of simulation results of these five methods. From the table, we can see that the method in this paper has a relatively small number of parameters. The recognition accuracy and average recognition accuracy are the highest. The performance of AMR is superior to the other four methods.
Figure 10 shows the confusion matrices of the above five methods when the SNR is 18 dB. Through comparison, it can be found that the confusion matrix of our proposed method has almost no light color blocks and the diagonal color blocks are darker. This further shows that our method has almost no misclassification and most modulation types can be accurately recognized.
We set the number of epochs to 100 and the batch size to 1024. The cross-entropy loss function and the Adam optimizer are used. At the same time, an early stopping mechanism is used. The patience is set to 5, which means that if the loss function has not improved for five consecutive epochs, training is stopped automatically, so as to prevent overfitting. Finally, the training time is about 28 min and one epoch takes about 24 s, much less than the training time of CNN. The training time is similar to those of ResNet, Inception and CLDNN, but the recognition effect is better than theirs. Therefore, in terms of online learning, our proposed method has higher recognition accuracy and the best overall performance.
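The training settings reported above can be expressed in Keras roughly as follows; the monitored quantity for early stopping and the validation data are assumptions, since the text only specifies the optimizer, loss, batch size, maximum epochs and patience, and X_train/y_train and X_test/y_test are hypothetical names for the preprocessed IQ frames and one-hot labels of the two dataset halves.

```python
# Sketch of the reported training configuration (monitored quantity and validation data are assumptions).
from tensorflow.keras.callbacks import EarlyStopping

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

early_stop = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)

history = model.fit(X_train, y_train,                  # hypothetical preprocessed training half
                    validation_data=(X_test, y_test),  # hypothetical held-out half of the dataset
                    batch_size=1024,
                    epochs=100,
                    callbacks=[early_stop])
```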

5.2. Influence of CNN Network Hyperparameters on Modulation Recognition Rate

A large part of the reason why this model can achieve such a good effect is that it uses appropriate hyperparameters. Next, we discuss the influence of hyperparameters on modulation recognition in detail. The hyperparameters of a neural network include the type of neuron activation function, the number of layers, the number of neurons in each hidden layer, the size of the convolution kernels, etc. These hyperparameters do not participate in training on the samples. However, differences in hyperparameters affect the learning speed of the neural network and the final classification result; in other words, they affect how fast the cost function drops on the training set and the classification accuracy on the validation set. However, there is no complete theoretical method or basis for selecting the hyperparameters. Therefore, this section mainly studies the problem of hyperparameter selection for the neural network model for radio modulation recognition. Here we use the model designed in Figure 8 as a benchmark. When one hyperparameter is discussed, the other hyperparameters remain unchanged.

5.2.1. Influence of the Number of Network Layers on Recognition Results

We use the model shown in Figure 8, keep the original hyperparameters unchanged and vary the number of convolution layers with 256 convolution kernels. We set up five sets of experiments, in which the number of convolution layers is 1, 2, 3, 4 and 5, respectively. The corresponding recognition accuracies at different SNRs are compared in Figure 11. It can be seen that at low SNR (less than −8 dB) the number of convolution layers has little influence on the recognition rate; the recognition rates of the different depths are almost the same. However, when the SNR is greater than −8 dB, the recognition rate is best with three convolution layers. When the number of convolutional layers is less than or equal to 3, the recognition rate increases as layers are added; when it is greater than 3, the recognition rate decreases as further layers are added. The recognition rates with 1, 2, 4 and 5 convolution layers are all worse than that with 3 layers. This indicates that too few or too many convolution layers degrades overall recognition performance. When the number of convolutional layers is too small, the local receptive field is small and the learning ability of the network is low, which leads to a low recognition rate. However, this does not mean that more convolution layers are always better. When the network is too deep, overfitting occurs easily and gradients attenuate, making it difficult to find the optimal parameters; this degrades network performance and leads to poor results. Therefore, an appropriate number of convolution layers should be selected when performing radio modulation recognition.
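The experiment can be reproduced in outline with a parameterized model builder such as the sketch below; it only approximates the Figure 8 architecture (the exact layer sizes of Table 11 are not reproduced) and simply varies the number of 1 × 3 convolution layers with 256 kernels while everything else stays fixed.

```python
from tensorflow.keras import layers, models

def build_cnn(num_conv_layers=3, filters=256, num_classes=11):
    """Schematic CNN for 2 x 128 I/Q inputs with a variable convolutional depth."""
    model = models.Sequential()
    model.add(layers.Reshape((2, 128, 1), input_shape=(2, 128)))
    for _ in range(num_conv_layers):
        # 'same' padding keeps the feature-map width, so the depth can be varied freely.
        model.add(layers.Conv2D(filters, (1, 3), padding="same", activation="relu"))
    model.add(layers.Flatten())
    model.add(layers.Dense(256, activation="relu"))
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model

# One model per depth, as in the comparison of Figure 11.
models_by_depth = {n: build_cnn(num_conv_layers=n) for n in range(1, 6)}
```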

5.2.2. Influence of the Number of Convolution Kernels on Recognition Results

We still use the model shown in Figure 8. The number of convolution kernels in the 1st, 2nd and 3rd convolution layers is set to 256, 128 and 64 in turn. The recognition accuracies at different SNRs are compared in Figure 12. We can see that the number of convolution kernels has little effect on the recognition rate at low SNR (less than −4 dB). However, when the SNR is higher than −4 dB, the recognition rate increases with the number of convolution kernels, and 256 kernels give the highest recognition rate. This shows that more convolution kernels provide stronger fitting ability and a higher recognition rate. At the same time, we also observe that the recognition rate with 256 kernels is only slightly higher than that of the other two settings. This shows that the extra fitting capacity gained by adding kernels is limited: once the number of kernels grows beyond a certain point, the feature extraction ability changes little and the recognition rates are similar. Therefore, choosing the number of convolution kernels appropriately is important when recognizing radio signals.
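Reusing the `build_cnn` sketch above, the kernel-count experiment only changes the `filters` argument; printing the parameter counts makes the growth in fitting capacity visible (the numbers are indicative of the sketch, not the exact values in Table 11).

```python
# Compare the size of the sketched model for 64, 128 and 256 kernels per layer.
for filters in (64, 128, 256):
    m = build_cnn(num_conv_layers=3, filters=filters)
    print(f"{filters} kernels per layer -> {m.count_params():,} trainable parameters")
```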

6. Accuracy of Common Evaluation Parameters and Classical Methods

Common Evaluation Parameters

Confusion matrix, accuracy rate, recall rate and average precision are commonly used parameters in DL. We can judge the performance of a network model according to them. That is to say, when we observe the training results, we pay attention to the changes in these parameters. Therefore, it is necessary to observe these evaluation parameters when using DL to recognize the modulation of radio signals.
First of all, the most important parameter is the confusion matrix, from which we can clearly see the predictions and the actual situation. Let us take the binary classification problem as an example. The binary confusion matrix is shown in Table 13, where '1' denotes the positive class, '0' denotes the negative class, 'Predicted' stands for the predicted results and 'Actual' for the actual results. The other variables in the table are defined as follows:
  • TP: The true value is positive and the predicted value of the model is positive (True Positive = TP)
  • FN: The actual value is positive and the predicted value of the model is negative (False Negative = FN)
  • FP: The actual value is negative and the predicted value of the model is positive (False Positive = FP)
  • TN: The true value is negative and the predicted value of the model is negative (True Negative = TN)
Suppose we want to classify radio signals containing only QPSK and WBFM modulation modes. Assume that in the confusion matrix Table 13, ’1’ represents QPSK and ’0’ represents WBFM. Then, TP represents the number of QPSK correctly predicted. FN represents the amount of QPSK that has been mispredicted as WBFM. TN represents the number of WBFM correctly predicted. FP represents the number of WBFM mispredicted as QPSK. TP + FN represents the actual amount of QPSK. FP + TN represents the actual amount of WBFM. TP + FP represents the amount predicted to be QPSK. FN + TN represents the amount predicted to be WBFM. TP + TN + FP + FN represents the total number of WBFM and QPSK samples. Therefore, we can clearly see the actual and predicted results from the confusion matrix.
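As an illustration (with hypothetical labels, not data from this paper), the binary QPSK/WBFM confusion matrix of Table 13 can be obtained with scikit-learn as follows, where '1' stands for QPSK and '0' for WBFM as in the text.

```python
from sklearn.metrics import confusion_matrix

y_actual    = [1, 1, 1, 0, 0, 0, 1, 0]   # ground-truth labels (1 = QPSK, 0 = WBFM)
y_predicted = [1, 0, 1, 0, 1, 0, 1, 0]   # classifier outputs

# With labels=[0, 1] the returned matrix is [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted, labels=[0, 1]).ravel()
print(f"TP={tp}, FN={fn}, FP={fp}, TN={tn}")   # TP=3, FN=1, FP=1, TN=3
```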
For a given set of test data, accuracy is defined as the ratio of the number of samples correctly classified by the classifier to the total number of samples, which is expressed as below
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (1)$$
The accuracy rate is generally used to evaluate the global accuracy of a model; it does not contain enough information to comprehensively evaluate the model's performance. In short, it is the proportion of correctly predicted samples among the total number of samples.
Precision refers to the proportion of the number of samples correctly predicted as a class to the total number of samples predicted as that class; it is defined with respect to the predicted results. For the example we gave earlier, the precision for QPSK is the proportion of the number of correctly predicted QPSK samples, TP, to the total number of samples predicted as QPSK, TP + FP, as shown in (2) below. The precision of WBFM is defined in the same way.
$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (2)$$
Recall refers to the proportion of the number of samples correctly predicted as a certain category to the actual total number of samples of that category; it is defined with respect to the actual samples. For the example we gave earlier, the recall for QPSK is the proportion of the number of correctly predicted QPSK samples, TP, to the total number of actual QPSK samples, TP + FN, as shown in (3) below. The recall of WBFM is defined in the same way.
$$\mathrm{Recall} = \frac{TP}{TP + FN} \quad (3)$$
The F1-score is the harmonic mean of recall and precision and provides a comprehensive evaluation of both, as expressed below
$$\mathrm{F1\text{-}score} = \frac{2 \times \mathrm{Recall} \times \mathrm{Precision}}{\mathrm{Recall} + \mathrm{Precision}} \quad (4)$$
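Written out directly from Equations (1)-(4), the four metrics for the toy QPSK/WBFM counts above can be computed as in the following sketch, which keeps the correspondence with the formulas explicit instead of relying on a library call.

```python
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)        # Equation (1)

def precision(tp, fp):
    return tp / (tp + fp)                          # Equation (2)

def recall(tp, fn):
    return tp / (tp + fn)                          # Equation (3)

def f1_score(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * r * p / (r + p)                     # Equation (4)

# Using the hypothetical counts TP=3, TN=3, FP=1, FN=1 from the sketch above:
print(accuracy(3, 3, 1, 1), precision(3, 1), recall(3, 1), f1_score(3, 1, 1))
```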

7. Discussion

With the continuous development of DL, many kinds of DL-based neural networks have been used for radio modulation recognition. Some methods achieve high recognition rates, while others do not. Of course, even for the same network model, the recognition rate varies with hyperparameters such as the number of network layers and the activation function. Below, we summarize the recognition rates of different methods on the RadioML2016.10a dataset, as shown in Table 14. From the table, we can see that DHN has the highest recognition accuracy among the earlier methods because its model differs from the general models. In order to capture the characteristics of modulated signals, the model extracts features at two levels: shallow information and deep information. Shallow information corresponds to shallow features, whose receptive field is only slightly larger than the samples of each symbol, while deep information corresponds to high-level features with a global view. In addition, whereas general models only take the SNR as a criterion of robustness, the DHN model uses the SNR as a weight in the loss function, which achieves stronger robustness. The model makes full use of all the data, extracts useful information from poor-quality samples and can play a greater role when datasets are small.
Considering that the datasets used in some of the latest literature differ from those used in the literature analyzed in our table, we cannot compare the experimental results of all the latest works; only recent works based on the RadioML2016.10a dataset can be selected for further comparison. ConvLSTMAE [110], a new method proposed in 2022 on RadioML2016.10a, was therefore used for comparison. It is found that ConvLSTMAE achieves higher accuracy than DHN. This is because it uses an AE as the backbone, with a convolutional AE and an LSTM-AE combined in parallel as spatial and temporal feature extractors, respectively. It has a parallel spatiotemporal structure and fewer parameters, which maintains high accuracy while reducing the computational cost.

8. Limitations

Due to the advancement and applications of modern radio communication technology, the electromagnetic environment has become more complex, which puts forward higher requirements for the analysis and processing of communication signals. Improving the radio modulation recognition efficiency and accuracy has great benefits for various military and civil fields such as electronic detection, spectrum monitoring, etc. By analyzing and researching various methods of radio modulation recognition based on DL, it is found that most methods still have some defects as described below.
The recognition rate at SNRs below 0 dB is low. It can be seen from the literature worldwide that the recognition rate of DL-based radio modulation recognition decreases as the SNR decreases. Generally, the recognition rate for SNRs above 10 dB is high, about 90%, and the recognition rate for SNRs above 0 dB can reach 80%. However, the recognition rate for SNRs below 0 dB is very low, generally less than 50%. Therefore, further research is needed to improve recognition at SNRs below 0 dB.
It is still difficult to use the various DL-based models in practical applications. The signals in the published datasets are ideal, with almost no noise and lossless channels. However, the real channel environment is very complex: actual noise and other uncertain factors have a great impact on modulation recognition, and the difficulty of modulation recognition in practice far exceeds that in simulation. This is an important research direction for the future.
There is some difficulty in implementing DL-based AMR methods on real hardware. At the moment, most researchers work with simulations in computer software, while very few work on real hardware implementations. Deploying DL-based AMR on real hardware is very challenging and requires further investigation.

9. Concluding Remarks

In the previous sections, we reviewed the network models of radio AMR based on DL and summarized the related literature. We found that most of the literature only discusses the advantages of the proposed network model, while few studies mention its disadvantages. Below, we summarize the advantages and disadvantages of several models mentioned above, as shown in Table 15. The CNN model has the advantage of local perception: it is good at sensing signal features locally and synthesizing the local information at a higher level to obtain global information. CNN also has the advantages of weight sharing and shift invariance, which greatly reduce the number of parameters to be trained. However, a CNN is a feed-forward neural network with unidirectional propagation, which can only process the current signal, and its input must be of fixed length. In reality, the state of a signal at a given moment is often related to the moments before and after it, and signal lengths are not always the same. The structure of the RNN solves this problem: its input includes not only the current time step but also feedback from the output of the previous time step, so an RNN can process variable-length signal sequences as time evolves. However, it suffers from the vanishing gradient problem and cannot handle long-term dependence. Subsequently, the LSTM, a revision of the RNN, solved these problems to some extent: by adding control units, an LSTM retains a memory over a certain period of time, and unlike the previous two networks it is trained by back propagation through time. However, these mechanisms in turn make the LSTM more complex and time-consuming to train.

Compared with CNN, RNN and LSTM, the DBN is a probabilistic generative model. Built from RBMs and based on Bayesian ideas, a DBN learns the joint probability distribution of the data, so it can automatically capture high-level information hidden in the data that is difficult to interpret. It can be understood as unsupervised data coding, and its output has a certain characterization effect on the data. In unsupervised learning, an unlabeled input dataset is provided to the algorithm; the goal is to identify patterns and cluster the data into groups based on similarity [111]. However, the recognition accuracy of the DBN is not very high and its training process is complicated. Therefore, when building a model, it is worth considering hybrid designs: by combining the advantages of each model, better results can be obtained. For example, in the CLDNN model mentioned in [57], the convolution layers are followed by recurrent layers, and CNN and LSTM achieve complementary effects, because the CNN is good at reducing frequency variation and the LSTM is good at temporal modeling [112,113]. A minimal sketch of such a hybrid design is given below.
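The following sketch stacks convolution layers in front of an LSTM in the spirit of the CLDNN idea; it is a generic illustration, not the exact architecture of [57], and the layer sizes are our own assumptions.

```python
from tensorflow.keras import layers, models

def build_cldnn_like(num_classes=11):
    inputs = layers.Input(shape=(2, 128))                                  # raw I/Q samples
    x = layers.Reshape((2, 128, 1))(inputs)
    x = layers.Conv2D(64, (1, 3), padding="same", activation="relu")(x)   # local features per channel
    x = layers.Conv2D(64, (2, 3), padding="valid", activation="relu")(x)  # mix the I and Q rows
    x = layers.Reshape((126, 64))(x)                                       # treat the remaining width as time steps
    x = layers.LSTM(128)(x)                                                # temporal modelling of the feature sequence
    x = layers.Dense(128, activation="relu")(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(inputs, outputs)
```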
As a key part of communication signal processing, radio signal modulation recognition has become a research hotspot with the development of artificial intelligence, including DL and neural networks. This paper provides a literature survey and summarizes the development and state of the art of radio modulation recognition based on DL. It also summarizes the existing neural-network-based modulation recognition methods and the open radio signal datasets. In addition, a CNN network for radio modulation recognition is designed, its effectiveness is verified and a comparison with previously existing networks is carried out to show its advantages. Finally, the common evaluation parameters and the accuracy of existing classical methods are introduced, and the problems to be solved in the field of radio modulation recognition as well as future research directions are discussed. It is hoped that this paper provides useful information on DL-based radio modulation recognition methods, so that more research and attention can be devoted to this topic to further improve the efficiency and accuracy of radio modulation recognition.

Author Contributions

Data curation, T.W. and M.J.; funding acquisition, Q.Y.; investigation, P.C.; methodology, T.W.; supervision, Z.X.; validation, G.Y. and T.W.; writing—original draft, T.W.; writing—review and editing, Q.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AMR: Automatic Modulation Recognition
AMC: Automatic Modulation Classification
AE: Auto-Encoder
AWGN: Additive White Gaussian Noise
NLOS: Non-line-of-sight
CNN: Convolutional Neural Network
CM: Correction Module
CLDNN: Convolutional Long short-term Deep Neural Network
DL: Deep Learning
DNN: Deep Neural Network
DBN: Deep Belief Network
DHN: Deep Hierarchical Network
DMCNN: Deep Multi-scale Convolutional Neural Network
DDrCNN: Dense layer Dropout Convolutional Neural Network
GRU: Gated Recurrent Unit
GNU: GNU's Not Unix
GRF: Graphic Representation of Features
LSTM: Long Short Term Memory
ML: Machine Learning
PC: Phase Congruency
RNN: Recurrent Neural Network
RT: Radon Transform
ResNet: Residual Network
RSR: Ratio of Successful Recognition
RAT: Radio Access Technology
RBM: Restricted Boltzmann Machine
SNR: Signal-to-Noise Ratio
STFT: Short-Time Fourier Transform
UDNN: Unsorted Deep Neural Network

References

  1. Mitola, J.; Maguire, G.Q. Cognitive radio: Making software radios more personal. IEEE Pers. Commun. 1999, 6, 13–18. [Google Scholar] [CrossRef] [Green Version]
  2. Li, P. Research on radar signal recognition based on automatic machine learning. Neural Comput. Appl. 2020, 32, 1959–1969. [Google Scholar] [CrossRef]
  3. Zhu, Z.; Nandi, A.K. Automatic Modulation Classification: Principles, Algorithms and Applications; John Wiley & Sons: Hoboken, NJ, USA, 2015. [Google Scholar]
  4. Wei, W.; Mendel, J.M. Maximum-likelihood classification for digital amplitude-phase modulations. IEEE Trans. Commun. 2000, 48, 189–193. [Google Scholar] [CrossRef]
  5. Hameed, F.; Dobre, O.A.; Popescu, D.C. On the likelihood-based approach to modulation classification. IEEE Trans. Wirel. Commun. 2009, 8, 5884–5892. [Google Scholar] [CrossRef]
  6. Yuan, Y.; Zhao, P.; Wang, B.; Wu, B. Hybrid maximum likelihood modulation classification for continuous phase modulations. IEEE Commun. Lett. 2016, 20, 450–453. [Google Scholar] [CrossRef]
  7. Dobre, O.A.; Abdi, A.; Bar-Ness, Y.; Su, W. Survey of automatic modulation classification techniques: Classical approaches and new trends. IET Commun. 2007, 1, 137–156. [Google Scholar] [CrossRef] [Green Version]
  8. Aslam, M.W.; Zhu, Z.; Nandi, A.K. Automatic modulation classification using combination of genetic programming and KNN. IEEE Trans. Wirel. Commun. 2012, 11, 2742–2750. [Google Scholar]
  9. Kishore, T.R.; Rao, K.D. Automatic intrapulse modulation classification of advanced LPI radar waveforms. IEEE Trans. Aerosp. Electron. Syst. 2017, 53, 901–914. [Google Scholar] [CrossRef]
  10. Shi, Y.; Zhang, X.D. A Gabor atom network for signal classification with application in radar target recognition. IEEE Trans. Signal Process. 2001, 49, 2994–3004. [Google Scholar] [CrossRef]
  11. Abuella, H.; Ozdemir, M.K. Automatic modulation classification based on kernel density estimation. Can. J. Electr. Comput. Eng. 2016, 39, 203–209. [Google Scholar] [CrossRef] [Green Version]
  12. Rodriguez, P.M.; Fernandez, Z.; Torrego, R.; Lizeaga, A.; Mendicute, M.; Val, I. Low-complexity cyclostationary-based modulation classifying algorithm. AEU-Int. J. Electron. Commun. 2017, 74, 176–182. [Google Scholar] [CrossRef]
  13. Madhavan, N.; Vinod, A.; Madhukumar, A.; Krishna, A.K. Spectrum sensing and modulation classification for cognitive radios using cumulants based on fractional lower order statistics. AEU-Int. J. Electron. Commun. 2013, 67, 479–490. [Google Scholar] [CrossRef]
  14. Nishani, E.; Çiço, B. Computer vision approaches based on deep learning and neural networks: Deep neural networks for video analysis of human pose estimation. In Proceedings of the 2017 6th Mediterranean Conference on Embedded Computing (MECO), Bar, Montenegro, 11–15 June 2017; pp. 1–4. [Google Scholar]
  15. Wang, P. Research and design of smart home speech recognition system based on deep learning. In Proceedings of the 2020 International Conference on Computer Vision, Image and Deep Learning (CVIDL), Chongqing, China, 10–12 July 2020; pp. 218–221. [Google Scholar]
  16. Dong, Y.n.; Liang, G.s. Research and discussion on image recognition and classification algorithm based on deep learning. In Proceedings of the 2019 International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI), Taiyuan, China, 8–10 November 2019; pp. 274–278. [Google Scholar]
  17. Zunjani, F.H.; Sen, S.; Shekhar, H.; Powale, A.; Godnaik, D.; Nandi, G. Intent-based object grasping by a robot using deep learning. In Proceedings of the 2018 IEEE 8th International Advance Computing Conference (IACC), Greater Noida, India, 14–15 December 2018; pp. 246–251. [Google Scholar]
  18. Kim, B.; Kim, J.; Chae, H.; Yoon, D.; Choi, J.W. Deep neural network-based automatic modulation classification technique. In Proceedings of the 2016 International Conference on Information and Communication Technology Convergence (ICTC), Jeju Island, Republic of Korea, 19–21 October 2016; pp. 579–582. [Google Scholar]
  19. O’Shea, T.J.; Corgan, J.; Clancy, T.C. Convolutional radio modulation recognition networks. In Engineering Applications of Neural Networks. EANN 2016; Communications in Computer and Information Science; Springer: Cham, Switzerland, 2016; pp. 213–226. [Google Scholar]
  20. O’Shea, T.J.; Hitefield, S.; Corgan, J. End-to-end radio traffic sequence recognition with recurrent neural networks. In Proceedings of the 2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Washington, DC, USA, 7–9 December 2016; pp. 277–281. [Google Scholar]
  21. Mendis, G.J.; Wei, J.; Madanayake, A. Deep learning-based automated modulation classification for cognitive radio. In Proceedings of the 2016 IEEE International Conference on Communication Systems (ICCS), Shenzhen, China, 14–16 December 2016; pp. 1–6. [Google Scholar]
  22. Ali, A.; Yangyu, F. k-Sparse Autoencoder-Based Automatic Modulation Classification with Low Complexity. IEEE Commun. Lett. 2017, 21, 2162–2165. [Google Scholar] [CrossRef]
  23. Hong, D.; Zhang, Z.; Xu, X. Automatic modulation classification using recurrent neural networks. In Proceedings of the 2017 3rd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 13–16 December 2017; pp. 695–700. [Google Scholar]
  24. Rajendran, S.; Meert, W.; Giustiniano, D.; Lenders, V.; Pollin, S. Deep learning models for wireless signal classification with distributed low-cost spectrum sensors. IEEE Trans. Cogn. Commun. Netw. 2018, 4, 433–445. [Google Scholar] [CrossRef] [Green Version]
  25. Ghanem, H.S.; Al-Makhlasawy, R.M.; El-Shafai, W.; Elsabrouty, M.; Hamed, H.F.; Salama, G.M.; El-Samie, F.E.A. Wireless modulation classification based on Radon transform and convolutional neural networks. J. Ambient. Intell. Humaniz. Comput. 2022. [Google Scholar] [CrossRef]
  26. Ghanem, H.S.; Shoaib, M.R.; El-Gazar, S.; Emara, H.; El-Shafai, W.; El-Moneim, S.A.; El-Fishawy, A.S.; Taha, T.E.; Hamed, H.F.; El-Banby, G.M.; et al. Automatic modulation classification with 2D transforms and convolutional neural network. Trans. Emerg. Telecommun. Technol. 2022, e4623. [Google Scholar] [CrossRef]
  27. Abdel-Moneim, M.A.; Al-Makhlasawy, R.M.; Abdel-Salam Bauomy, N.; El-Rabaie, E.S.M.; El-Shafai, W.; Farghal, A.E.; Abd El-Samie, F.E. An efficient modulation classification method using signal constellation diagrams with convolutional neural networks, Gabor filtering and thresholding. Trans. Emerg. Telecommun. Technol. 2022, 33, e4459. [Google Scholar] [CrossRef]
  28. Hamidi-Rad, S.; Jain, S. Mcformer: A transformer based deep neural network for automatic modulation classification. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11 December 2021; pp. 1–6. [Google Scholar]
  29. Wu, X.; Wei, S.; Zhou, Y. Deep multi-scale representation learning with attention for automatic modulation classification. In Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN), Padua, Italy, 28 July 2022; pp. 1–8. [Google Scholar]
  30. Sun, Y.; Ball, E.A. Automatic modulation classification using techniques from image classification. IET Commun. 2022, 16, 1303–1314. [Google Scholar] [CrossRef]
  31. Zhang, M.; Diao, M.; Guo, L. Convolutional neural networks for automatic cognitive radio waveform recognition. IEEE Access 2017, 5, 11074–11082. [Google Scholar] [CrossRef]
  32. Yashashwi, K.; Sethi, A.; Chaporkar, P. A learnable distortion correction module for modulation recognition. IEEE Wirel. Commun. Lett. 2018, 8, 77–80. [Google Scholar] [CrossRef] [Green Version]
  33. Gao, L.; Zhang, X.; Gao, J.; You, S. Fusion image based radar signal feature extraction and modulation recognition. IEEE Access 2019, 7, 13135–13148. [Google Scholar] [CrossRef]
  34. Wang, Y.; Liu, M.; Yang, J.; Gui, G. Data-driven deep learning for automatic modulation recognition in cognitive radios. IEEE Trans. Veh. Technol. 2019, 68, 4074–4077. [Google Scholar] [CrossRef]
  35. Zhang, Q.; Xu, Z.; Zhang, P. Modulation recognition using wavelet-assisted convolutional neural network. In Proceedings of the 2018 International Conference on Advanced Technologies for Communications (ATC), Ho Chi Minh City, Vietnam, 18–20 October 2018; pp. 100–104. [Google Scholar]
  36. Hiremath, S.M.; Deshmukh, S.; Rakesh, R.; Patra, S.K. Blind identification of radio access techniques based on time-frequency analysis and convolutional neural network. In Proceedings of the TENCON 2018—2018 IEEE Region 10 Conference, Jeju Island, Republic of Korea, 28–31 October 2018; pp. 1163–1167. [Google Scholar]
  37. Wu, H.; Li, Y.; Zhou, L.; Meng, J. Convolutional neural network and multi-feature fusion for automatic modulation classification. Electron. Lett. 2019, 55, 895–897. [Google Scholar] [CrossRef]
  38. Gu, H.; Wang, Y.; Hong, S.; Gui, G. Blind channel identification aided generalized automatic modulation recognition based on deep learning. IEEE Access 2019, 7, 110722–110729. [Google Scholar] [CrossRef]
  39. Zheng, S.; Qi, P.; Chen, S.; Yang, X. Fusion methods for CNN-based automatic modulation classification. IEEE Access 2019, 7, 66496–66504. [Google Scholar] [CrossRef]
  40. Yongshi, W.; Jie, G.; Hao, L.; Li, L.; Zhigang, W.; Houjun, W. CNN-based modulation classification in the complicated communication channel. In Proceedings of the 2017 13th IEEE International Conference on Electronic Measurement & Instruments (ICEMI), Yangzhou, China, 20–23 October 2017; pp. 512–516. [Google Scholar]
  41. Dileep, P.; Das, D.; Bora, P.K. Dense layer dropout based CNN architecture for automatic modulation classification. In Proceedings of the 2020 National Conference on Communications (NCC), Kharagpur, India, 21–23 February 2020; pp. 1–5. [Google Scholar]
  42. Li, R.; Li, L.; Yang, S.; Li, S. Robust automated VHF modulation recognition based on deep convolutional neural networks. IEEE Commun. Lett. 2018, 22, 946–949. [Google Scholar] [CrossRef]
  43. Peng, S.; Jiang, H.; Wang, H.; Alwageed, H.; Zhou, Y.; Sebdani, M.M.; Yao, Y.D. Modulation classification based on signal constellation diagrams and deep learning. IEEE Trans. Neural Netw. Learn. Syst. 2018, 30, 718–727. [Google Scholar] [CrossRef]
  44. Wu, H.; Wang, Q.; Zhou, L.; Meng, J. VHF radio signal modulation classification based on convolution neural networks. In Proceedings of the 1st International Symposium on Water System Operations, MATEC Web of Conferences, Beijing, China, 17 October 2018; Volume 246, p. 03032. [Google Scholar]
  45. Peng, S.; Jiang, H.; Wang, H.; Alwageed, H.; Yao, Y.D. Modulation classification using convolutional neural network based deep learning model. In Proceedings of the 2017 26th Wireless and Optical Communication Conference (WOCC), Newark, NJ, USA, 7–8 April 2017; pp. 1–5. [Google Scholar]
  46. Kulin, M.; Kazaz, T.; Moerman, I.; De Poorter, E. End-to-end learning from spectrum data: A deep learning approach for wireless signal identification in spectrum monitoring applications. IEEE Access 2018, 6, 18484–18501. [Google Scholar] [CrossRef]
  47. O’Shea, T.J.; Roy, T.; Clancy, T.C. Over-the-air deep learning based radio signal classification. IEEE J. Sel. Top. Signal Process. 2018, 12, 168–179. [Google Scholar] [CrossRef] [Green Version]
  48. Longi, K.; Pulkkinen, T.; Klami, A. Semi-supervised convolutional neural networks for identifying wi-fi interference sources. In Proceedings of the Ninth Asian Conference on Machine Learning, Seoul, Republic of Korea, 15–17 November 2017; pp. 391–406. [Google Scholar]
  49. Zhang, M.; Zeng, Y.; Han, Z.; Gong, Y. Automatic modulation recognition using deep learning architectures. In Proceedings of the 2018 IEEE 19th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Kalamata, Greece, 25–28 June 2018; pp. 1–5. [Google Scholar]
  50. Sang, Y.; Li, L. Application of novel architectures for modulation recognition. In Proceedings of the 2018 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), Chengdu, China, 26–30 October 2018; pp. 159–162. [Google Scholar]
  51. Wang, Z.; Sun, D.; Gong, K.; Wang, W.; Sun, P. A Lightweight CNN Architecture for Automatic Modulation Classification. Electronics 2021, 10, 2679. [Google Scholar] [CrossRef]
  52. Zhang, W.T.; Cui, D.; Lou, S.T. Training images generation for CNN based automatic modulation classification. IEEE Access 2021, 9, 62916–62925. [Google Scholar] [CrossRef]
  53. Du, R.; Liu, F.; Xu, J.; Gao, F.; Hu, Z.; Zhang, A. D-GF-CNN Algorithm for Modulation Recognition. Wirel. Pers. Commun. 2022, 124, 989–1010. [Google Scholar] [CrossRef]
  54. Shi, F.; Hu, Z.; Yue, C.; Shen, Z. Combining neural networks for modulation recognition. Digit. Signal Process. 2022, 120, 103264. [Google Scholar] [CrossRef]
  55. Lin, S.; Zeng, Y.; Gong, Y. Learning of Time-Frequency Attention Mechanism for Automatic Modulation Recognition. IEEE Wirel. Commun. Lett. 2022, 11, 707–711. [Google Scholar] [CrossRef]
  56. Le, H.K.; Doan, V.S.; Hoang, V.P. Ensemble of Convolution Neural Networks for Improving Automatic Modulation Classification Performance. J. Sci. Technol. 2022, 20, 25–32. [Google Scholar] [CrossRef]
  57. West, N.E.; O’shea, T. Deep architectures for modulation recognition. In Proceedings of the 2017 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), Baltimore, MD, USA, 6–9 March 2017; pp. 1–6. [Google Scholar]
  58. Daldal, N.; Yıldırım, Ö.; Polat, K. Deep long short-term memory networks-based automatic recognition of six different digital modulation types under varying noise conditions. Neural Comput. Appl. 2019, 31, 1967–1981. [Google Scholar] [CrossRef]
  59. Zhang, Y.; Tong, L.; Zhang, L.; Kan, W. A deep learning approach for modulation recognition. In Proceedings of the 2018 IEEE 23rd International Conference on Digital Signal Processing (DSP), Shanghai, China, 19–21 November 2018; pp. 1–5. [Google Scholar]
  60. Mendis, G.J.; Wei, J.; Madanayake, A. Deep belief network for automated modulation classification in cognitive radio. In Proceedings of the 2017 Cognitive Communications for Aerospace Applications Workshop (CCAA), Cleveland, OH, USA, 27–28 June 2017; pp. 1–5. [Google Scholar]
  61. Sun, X.; Gao, L.; Luo, X.; Su, K. RBM based cooperative Bayesian compressive spectrum sensing with adaptive threshold. In Proceedings of the 2016 IEEE/CIC International Conference on Communications in China (ICCC), Chengdu, China, 27–29 July 2016; pp. 1–6. [Google Scholar]
  62. Huang, S.; Dai, R.; Huang, J.; Yao, Y.; Gao, Y.; Ning, F.; Feng, Z. Automatic modulation classification using gated recurrent residual network. IEEE Internet Things J. 2020, 7, 7795–7807. [Google Scholar] [CrossRef]
  63. Jiyuan, T.; Limin, Z.; Zhaogen, Z.; Wenlong, Y. Multi-modulation recognition using convolution gated recurrent unit networks. J. Phys. Conf. Ser. 2019, 1284, 012052. [Google Scholar] [CrossRef]
  64. Li, J.; Qi, L.; Lin, Y. Research on modulation identification of digital signals based on deep learning. In Proceedings of the 2016 IEEE International Conference on Electronic Information and Communication Technology (ICEICT), Harbin, China, 20–22 August 2016; pp. 402–405. [Google Scholar]
  65. Li, M.; Li, O.; Liu, G.; Zhang, C. Generative adversarial networks-based semi-supervised automatic modulation recognition for cognitive radio networks. Sensors 2018, 18, 3913. [Google Scholar] [CrossRef] [Green Version]
  66. Nie, J.; Zhang, Y.; He, Z.; Chen, S.; Gong, S.; Zhang, W. Deep hierarchical network for automatic modulation classification. IEEE Access 2019, 7, 94604–94613. [Google Scholar] [CrossRef]
  67. Liu, X.; Yang, D.; El Gamal, A. Deep neural network architectures for modulation classification. In Proceedings of the 2017 51st Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, 29 October–1 November 2017; pp. 915–919. [Google Scholar]
  68. Xie, X.; Yang, G.; Jiang, M.; Ye, Q.; Yang, C.F. A Kind of Wireless Modulation Recognition Method Based on DenseNet and BLSTM. IEEE Access 2021, 9, 125706–125713. [Google Scholar] [CrossRef]
  69. Hao, X.; Luo, Y.; Ye, Q.; He, Q.; Yang, G.; Chen, C.C. Automatic Modulation Recognition Method Based on Hybrid Model of Convolutional Neural Networks and Gated Recurrent Units. Sens. Mater. 2021, 33, 4229–4243. [Google Scholar] [CrossRef]
  70. Liu, F.; Zhang, Z.; Zhou, R. Automatic modulation recognition based on CNN and GRU. Tsinghua Sci. Technol. 2021, 27, 422–431. [Google Scholar] [CrossRef]
  71. Njoku, J.N.; Morocho-Cayamcela, M.E.; Lim, W. CGDNet: Efficient hybrid deep learning model for robust automatic modulation recognition. IEEE Netw. Lett. 2021, 3, 47–51. [Google Scholar] [CrossRef]
  72. Wang, N.; Liu, Y.; Ma, L.; Yang, Y.; Wang, H. Multidimensional CNN-LSTM network for automatic modulation classification. Electronics 2021, 10, 1649. [Google Scholar] [CrossRef]
  73. Lei, Z.; Jiang, M.; Yang, G.; Guan, T.; Huang, P.; Gu, Y.; Xu, Z.; Ye, Q. Towards recurrent neural network with multi-path feature fusion for signal modulation recognition. Wirel. Netw. 2022, 28, 551–565. [Google Scholar] [CrossRef]
  74. Wang, T.; Hou, Y.; Zhang, H.; Guo, Z. Deep learning based modulation recognition with multi-cue fusion. IEEE Wirel. Commun. Lett. 2021, 10, 1757–1760. [Google Scholar] [CrossRef]
  75. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
  76. Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; et al. Recent advances in convolutional neural networks. Pattern Recognit. 2018, 77, 354–377. [Google Scholar] [CrossRef] [Green Version]
  77. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef] [Green Version]
  78. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef] [Green Version]
  79. Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Computer Vision—ECCV 2014; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2014; pp. 818–833. [Google Scholar]
  80. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  81. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
  82. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  83. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
  84. Lauer, F.; Suen, C.Y.; Bloch, G. A trainable feature extractor for handwritten digit recognition. Pattern Recognit. 2007, 40, 1816–1824. [Google Scholar] [CrossRef]
  85. Werbos, P.J. Backpropagation through time: What it does and how to do it. Proc. IEEE 1990, 78, 1550–1560. [Google Scholar] [CrossRef] [Green Version]
  86. Fang, W.; Jiang, J.; Lu, S.; Gong, Y.; Tao, Y.; Tang, Y.; Yan, P.; Luo, H.; Liu, J. A LSTM algorithm estimating pseudo measurements for aiding INS during GNSS signal outages. Remote Sens. 2020, 12, 256. [Google Scholar] [CrossRef] [Green Version]
  87. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  88. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  89. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078. [Google Scholar]
  90. Azzouz, E.E.; Nandi, A.K. Automatic identification of digital modulation types. Signal Process. 1995, 47, 55–69. [Google Scholar] [CrossRef]
  91. Hinton, G.E.; Osindero, S.; Teh, Y.W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554. [Google Scholar] [CrossRef]
  92. Cairong, Z.; Xinran, Z.; Cheng, Z.; Li, Z. A novel DBN feature fusion model for cross-corpus speech emotion recognition. J. Electr. Comput. Eng. 2016, 2016, 1–11. [Google Scholar] [CrossRef] [Green Version]
  93. Jun, C.; Qin, Y.; Yi, Z. Speech signals identification base on improved DBN. In Proceedings of the 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China, 25–26 March 2017; pp. 1144–1148. [Google Scholar]
  94. Shi, P. Speech emotion recognition based on deep belief network. In Proceedings of the 2018 IEEE 15th International Conference on Networking, Sensing and Control (ICNSC), Zhuhai, China, 27–29 March 2018; pp. 1–5. [Google Scholar]
  95. Xie, Y.; Zou, C.R.; Liang, R.Y.; Tao, H.W. Phoneme recognition based on deep belief network. In Proceedings of the 2016 International Conference on Information System and Artificial Intelligence (ISAI), Hong Kong, China, 24–26 June 2016; pp. 352–355. [Google Scholar]
  96. Kakkar, D. Facial expression recognition with LDPP & LTP using deep belief network. In Proceedings of the 2018 5th International Conference on Signal Processing and Integrated Networks (SPIN), Noida, India, 22–23 February 2018; pp. 503–508. [Google Scholar]
  97. Wu, F.; Wang, Z.; Lu, W.; Li, X.; Yang, Y.; Luo, J.; Zhuang, Y. Regularized deep belief network for image attribute detection. IEEE Trans. Circuits Syst. Video Technol. 2016, 27, 1464–1477. [Google Scholar] [CrossRef]
  98. Uddin, M.Z.; Hassan, M.M.; Almogren, A.; Alamri, A.; Alrubaian, M.; Fortino, G. Facial expression recognition utilizing local direction-based robust features and deep belief network. IEEE Access 2017, 5, 4525–4536. [Google Scholar] [CrossRef]
  99. Fan, R.; Hu, W. Face recognition with improved deep belief networks. In Proceedings of the 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), Guilin, China, 29–31 July 2017; pp. 1822–1826. [Google Scholar]
  100. Cheng, M. The cross-field DBN for image recognition. In Proceedings of the 2015 IEEE International Conference on Progress in Informatics and Computing (PIC), Nanjing, China, 18–20 December 2015; pp. 83–86. [Google Scholar]
  101. Hamel, P.; Eck, D. Learning features from music audio with deep belief networks. In International Society for Music Information Retrieval Conference; ISMIR: Utrecht, The Netherlands, 2010; Volume 10, pp. 339–344. [Google Scholar]
  102. Cui, Y.; Jing, X.J.; Sun, S.; Wang, X.; Cheng, D.; Huang, H. Deep learning based primary user classification in cognitive radios. In Proceedings of the 2015 15th International Symposium on Communications and Information Technologies (ISCIT), Nara, Japan, 7–9 October 2015; pp. 165–168. [Google Scholar]
  103. Quan, D.; Tang, Z.; Wang, X.; Zhai, W.; Qu, C. LPI radar signal recognition based on dual-Channel CNN and feature fusion. Symmetry 2022, 14, 570. [Google Scholar] [CrossRef]
  104. Kumar, S.; Mahapatra, R.; Singh, A. Automatic modulation recognition: An FPGA implementation. IEEE Commun. Lett. 2022, 26, 2062–2066. [Google Scholar] [CrossRef]
  105. O’shea, T.J.; West, N. Radio machine learning dataset generation with gnu radio. In Proceedings of the GNU Radio Conference, Boulder, CO, USA, 12–16 September 2016; Volume 1. [Google Scholar]
  106. Tekbıyık, K.; Ekti, A.R.; Görçin, A.; Kurt, G.K.; Keçeci, C. Robust and fast automatic modulation classification with CNN under multipath fading channels. In Proceedings of the 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring), Antwerp, Belgium, 25–28 May 2020; pp. 1–6. [Google Scholar]
  107. Blossom, E. GNU radio: Tools for exploring the radio frequency spectrum. Linux J. 2004, 2004, 4. [Google Scholar]
  108. O’shea, T.J. Gnu radio channel simulation. In Proceedings of the GNU Radio Conference, Boston, MA, USA, 1–3 October 2013. [Google Scholar]
  109. ITU-R M.1225; Guidelines for Evaluation of Radio Transmission Technologies for IMT-2000. International Telecommunications Union, Radio communications Sector (ITU-R): Englewood, CO, USA, 1997.
  110. Yunhao, S.; Hua, X.; Lei, J.; Zisen, Q. ConvLSTMAE: A Spatiotemporal Parallel Autoencoders for Automatic Modulation Classification. IEEE Commun. Lett. 2022, 26, 1804–1808. [Google Scholar] [CrossRef]
  111. Sheraz, M.; Ahmed, M.; Hou, X.; Li, Y.; Jin, D.; Han, Z.; Jiang, T. Artificial intelligence for wireless caching: Schemes, performance and challenges. IEEE Commun. Surv. Tutor. 2020, 23, 631–661. [Google Scholar] [CrossRef]
  112. Sainath, T.N.; Vinyals, O.; Senior, A.; Sak, H. Convolutional, long short-term memory, fully connected deep neural networks. In Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, Qld, Australia, 19–24 April 2015; pp. 4580–4584. [Google Scholar]
  113. Ramjee, S.; Ju, S.; Yang, D.; Liu, X.; Gamal, A.E.; Eldar, Y.C. Fast deep learning for automatic modulation classification. arXiv 2019, arXiv:1901.05850. [Google Scholar]
Figure 1. Typical CNN structure.
Figure 2. RNN structure.
Figure 3. Cyclic unit [86].
Figure 4. LSTM cell structure.
Figure 5. DBN network structure [101].
Figure 6. Flow chart of CNN-based AMR method.
Figure 7. Implementation of our method.
Figure 8. Modulation recognition model based on CNN.
Figure 9. Classification accuracy over different SNRs.
Figure 10. Confusion matrices for different models when SNR = 18 dB.
Figure 11. Recognition accuracy of different number of convolution layers.
Figure 12. Recognition accuracy of different number of convolution kernels.
Table 1. Applications of DL models in radio modulation recognition.
DL Model | Literature
CNN | [31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56]
RNN | [23]
LSTM | [24,49,50,57,58]
DBN | [21,59,60,61]
Other | GRU [62,63], AE [22,64], CGRN [65], CLDNN [66,67], DenseNet + BLSTM + DNN [68], CNN + GRU [69,70], CNN + GRU + DNN [71], CNN + LSTM [72,73], CNN + IndRNN [74]
Table 2. Radio modulation recognition networks based on CNN.
Year | Author | The Innovation of the Paper | Dataset | Evaluation Parameter | The Technology of DL
2017 | Zhang et al. [31] | CNN | Created dataset 1 | The overall RSR is 93.7%. | CNN architecture; Two-dimensional time-frequency diagram.
2017 | Peng et al. [45] | Modulation recognition based on CNN. | Unspecified | High SNR area is close to 100%. | AlexNet model; CNN.
2017 | Wang et al. [40] | Improved AMC method based on CNN. | Created dataset 1 | The accuracy is up to 90%. | CNN architecture
2018 | Li et al. [42] | A novel sparse-filtering criterion; Unsupervised pre-training. | Created dataset 1 | Accuracy higher than 95%. | Sparse-filtering CNN; Unsupervised pretraining.
2018 | Peng et al. [43] | Modulation classification based on CNN. | Created dataset 1 | The classification accuracy reached 97.1%. | AlexNet; GoogLeNet.
2018 | Wu et al. [44] | VHF radio signal modulation classification based on CNN. | Created dataset 1 | The classification accuracy can reach 99%. | CNN
2018 | Kulin et al. [46] | Time-domain features are used to train the CNN classifier. | RadioML2016.10a | The classification accuracy is up to 99%. | CNN
2018 | O'Shea et al. [47] | Modulation recognition based on residual network. | RadioML2018.01a | Up to 94% accuracy. | Deep residual network
2018 | Xu et al. [35] | CNN | Created spectrum images. | The average recognition rate can reach 95%. | CNN
2018 | Rakesh et al. [36] | Combination of time-frequency distribution and CNN. | RAT spectrum generated with Matlab. | Classification accuracy up to 100%. | Blind identification method; CNN architecture.
2018 | Longi et al. [48] | Supervision model based on CNN. | Collected data | Achieves a 2% error rate. | Semi-supervised learning; CNN.
2018 | Zhang et al. [49] | CNN; A preprocessed signal representation. | RadioML2016.10a | The accuracy is improved by 8%. | CNN
2018 | Sang et al. [50] | Improved CNN. | RadioML2016.10a | The accuracy can reach 93%. | CNN
2019 | Sethi et al. [32] | Correction module (CM) + CNN. | RadioML2016.10a | The accuracy is up to 90%. | Correction module (CM); CNN architecture.
2019 | Gao et al. [33] | CNN based on transfer learning. | Created dataset 1 | The overall RSR is up to 95.5%. | Image fusion algorithm; CNN with transfer learning.
2019 | Wang et al. [34] | Combination of two CNNs based on DL; CNN based on constellation diagrams. | Constellation dataset created. | The accuracy of the former can reach more than 95%; the latter is close to 100%. | Planisphere; DrCNN.
2019 | Wu et al. [37] | AMC with multi-feature fusion based on CNN. | RadioML2016.10a | The average accuracy is 80%; reduced training time. | Multi-feature fusion; CNN architecture.
2019 | Gu et al. [38] | Generalized AMR based on two CNNs. | Created dataset 1 | The accuracy can reach 98%. | CNN architecture.
2019 | Yang et al. [39] | AMR of CNN based on three fusion methods. | Created dataset 1 | The accuracy is 96%, 97%, 98%. | CNN architecture.
2020 | Dileep et al. [41] | Dense layer dropout CNN (DDrCNN). | Created dataset 1 | More than 97% accuracy can be achieved. | CNN architecture; Classification cross entropy.
2021 | Wang et al. [51] | An AMC method based on lightweight CNN. | RadioML2016.10a; RadioML2018.01a | The proposed network can save 70∼98% of the model parameters and 30∼99% of the inference time. | CNN; Residual architecture.
2021 | Zhang et al. [52] | An AMC method based on multiple-scale CNN. | Created dataset 1 | The averaged classification accuracy reaches approximately 97.7% at 4 dB SNR. | CNN
2022 | Ghanem et al. [25] | An AMC method based on CNN, which uses the Radon transform of constellation diagrams as input. | Created dataset 1 | The classification accuracy reaches 100% at 5 dB SNR. | CNN; AlexNet; VGG.
2022 | Du et al. [53] | A dilated CNN for AMR. | Created dataset 1 | The recognition accuracy under low SNR is significantly improved. | Dilated CNN.
2022 | Shi et al. [54] | An AMR method based on a multi-scale convolution deep network. | RadioML2018.01a | The recognition accuracy can reach 98.7%. | CNN; Attention mechanisms.
2022 | Lin et al. [55] | An AMR framework based on CNN with a time-frequency attention mechanism. | RadioML2018.01a; RadioML2016.10b | This method has a higher recognition rate and fewer parameters. | CNN; Attention mechanisms.
2022 | Le et al. [56] | Five CNN models are proposed for AMC. | HisarMod2019.1 | The highest accuracy can reach 97.5%. | ResNet18; SqueezeNet; GoogleNet; MobileNet; RepVGG.
1 The dataset created refers to the data signals designed by the authors according to their own needs by using simulation software.
Table 3. Radio modulation recognition networks based on RNN and LSTM.
Year | Author | The Innovation of the Paper | Dataset | Evaluation Parameter | The Technology of DL
2017 | Hong et al. [23] | RNN | RadioML2016.10a | The classification accuracy can reach 91%. | RNN
2017 | West et al. [57] | LSTM | RadioML2016.10a | The classification accuracy is about 90%. | LSTM
2018 | Rajendran et al. [24] | Two-layer LSTM. | RadioML2016.10a | The average classification accuracy is close to 90%. | RNN; LSTM.
2018 | Zhang et al. [49] | LSTM; Preprocessed signal. | RadioML2016.10a | The accuracy is improved by 8%. | LSTM
2018 | Sang et al. [50] | Improved LSTM. | RadioML2016.10a | The accuracy can reach 93%. | LSTM
2019 | Daldal et al. [58] | LSTM | Created dataset 1 | The accuracy can reach 94.72%. | LSTM
1 The dataset created refers to the data signals designed by the authors according to their own needs by using simulation software.
Table 4. Radio modulation recognition networks based on DBN.
Year | Author | The Innovation of the Paper | Dataset | Evaluation Parameter | The Technology of DL
2015 | Cui et al. [102] | User-centered DBN model | Actual sampling data | Short time; accuracy increased. | DBN model
2016 | Sun et al. [61] | Collaborative Bayesian compressed spectrum detection method based on RBM. | Created dataset 1 | Improved detection accuracy; enhanced anti-interference capability. | RBM
2016 | Wei et al. [60] | DBN with anti-noise ability. | Created dataset 1 | The accuracy can reach more than 90%. | Feature representation mechanism based on SCF; DBN network.
2017 | Wei et al. [21] | DBN based on low complexity. | Simulation-created dataset | The classification accuracy can reach more than 90%. | Feature representation mechanism based on SCF; DBN network.
2018 | Zhang et al. [59] | DBN | Simulation-created dataset | The average recognition rate is 92.12%. | Unsupervised greedy algorithm; DBN network.
1 The dataset created refers to the data signals designed by the authors according to their own needs by using simulation software.
Table 5. Other radio modulation recognition networks.
Year | Author | The Innovation of the Paper | Dataset | Evaluation Parameter | The Technology of DL
2017 | Liu et al. [67] | Convolutional long short-term deep neural network (CLDNN) | RadioML2016.10a | The accuracy can reach 88.5%. | CNN; Convolutional long short-term network.
2017 | Ali et al. [22] | Data classifier (UDNN) | Signals actually collected. | The classification accuracy can reach 95%. | Sparse autoencoder; Classification cross entropy.
2017 | Qi et al. [64] | Deep auto-encoder network. | Created dataset 1 | When the SNR is 10 dB, the recognition rate can reach 1. | AE
2018 | Li et al. [65] | Semi-supervised learning method with adversarial training. | RadioML2016.10a | The classification accuracy is 91%. | STN network structure; CGRN adversarial network.
2019 | Nie et al. [66] | Deep hierarchical network (DHN) based on CNN. | RadioML2016.10a | The accuracy can reach 93%. | SNR as a weight in training; DHN.
2021 | Xie et al. [68] | An AMR method based on a DenseNet + BLSTM + DNN network. | RadioML2016.10a | The recognition accuracy is higher than that of traditional modulation recognition methods. | DenseNet; BLSTM; DNN
2021 | Hao et al. [69] | An AMR method based on a CNN-GRU hybrid network. | RadioML2016.04c; RadioML2016.10a | The comprehensive recognition accuracy on the two datasets is 60.64% and 73.2%, respectively. | CNN; GRU.
2021 | Njoku et al. [71] | An AMC method based on a CNN + GRU + DNN network. | RadioML2016.10a; RadioML2016.10b | The recognition accuracy can reach 93.5% and 90.38% on RadioML2016.10a and RadioML2016.10b, respectively. | CNN; GRU; DNN
2021 | Wang et al. [72] | An AMC method of hierarchical multi-feature fusion based on multidimensional CNN and LSTM. | RadioML2016.10a; RadioML2016.10b | The recognition accuracy is higher than that of other methods. | CNN; LSTM
2021 | Wang et al. [74] | A novel multi-cue fusion network for AMR. | RadioML2016.10a; RadioML2018.01a | The recognition accuracy can reach 97.8% and 96.1% on RadioML2016.10a and RadioML2018.01a, respectively. | CNN; IndRNN; Attention mechanisms
2021 | Liu et al. [70] | The GRU based on feature extraction and the CNN based on the cyclic spectrum are combined. | Created dataset 1 | The recognition rate is 100% when the SNR is −1 dB. | CNN; GRU
2022 | Lei et al. [73] | An AMR method based on a novel multi-path feature fusion network. | RadioML2016.04c | The recognition accuracy is 99.04% at 18 dB SNR. | CNN; LSTM
1 The dataset created refers to the data signals designed by the authors according to their own needs by using simulation software.
Table 6. Description of display parameters for RadioML2016.04c dataset.
Dataset: RadioML2016.04c
Number of modulation modes: 11
Number of digital modulation modes: 8
Number of analog modulation modes: 3
Modulation modes: 8PSK, AM-DSB, AM-SSB, BPSK, CPFSK, GFSK, PAM4, QAM16, QAM64, QPSK, WBFM
Format of each sample: 2 × 128
Number of samples: 220,000
Samples per symbol: 8
SNR (dB): −20:2:18
Table 7. Description of display parameters for RadioML2016.10a dataset.
Dataset: RadioML2016.10a
Number of modulation modes: 11
Number of digital modulation modes: 8
Number of analog modulation modes: 3
Modulation modes: 8PSK, AM-DSB, AM-SSB, BPSK, CPFSK, GFSK, PAM4, QAM16, QAM64, QPSK, WBFM
Format of each sample: 2 × 128
Number of samples: 220,000
Samples per symbol: 8
SNR (dB): −20:2:18
Table 8. Description of display parameters for RadioML2016.10b dataset.
Dataset: RadioML2016.10b
Number of modulation modes: 11
Number of digital modulation modes: 8
Number of analog modulation modes: 3
Modulation modes: 8PSK, BPSK, CPFSK, GFSK, PAM4, QAM16, QAM64, QPSK, WBFM, AM-DSB
Format of each sample: 2 × 128
Number of samples: 1,200,000
SNR (dB): −20:2:18
Table 9. Description of display parameters for RadioML2018.01a dataset.
Dataset: RadioML2018.01a
Number of modulation modes: 24
Modulation modes: 32PSK, 16APSK, 32QAM, FM, GMSK, 32APSK, OQPSK, 8ASK, BPSK, 8PSK, AM-SSB-SC, 4ASK, 16PSK, 64APSK, 128QAM, 128APSK, AM-DSB-SC, AM-SSB-WC, 64QAM, QPSK, AM-DSB-WC, 256QAM, OOK, 16QAM
Format of each sample: 2 × 1024
Number of samples: 5,234,491,392
SNR (dB): −20:2:30
Table 10. Description of display parameters for HisarMod2019.1 dataset.
Dataset: HisarMod2019.1
Number of modulation modes: 26
Modulation modes:
  Analog modulation: AM-DSB, AM-SC, AM-USB, AM-LSB, FM, PM.
  FSK modulation: 2FSK, 4FSK, 8FSK, 16FSK.
  PAM modulation: 4PAM, 8PAM, 16PAM.
  PSK modulation: BPSK, QPSK, 8PSK, 16PSK, 32PSK, 64PSK.
  QAM modulation: 4QAM, 8QAM, 16QAM, 32QAM, 64QAM, 128QAM, 256QAM.
Format of each sample: 2 × 1024
Number of samples: 780,000
Number of signals per modulation type: 1500
SNR (dB): −20:2:18
Table 11. Detailed structure of the model.
Layer | Size of Convolution Kernel | Layer Output Dimension | Layer Activation Function | Number of Trainable Parameters
Input layer | - | 2 × 128 | - | 0
Pooled convolutional layer 1 | 1 × 3 | 130 × 256 | ReLU | 1024
Pooled convolutional layer 2 | 1 × 3 | 132 × 256 | ReLU | 196,864
Pooled convolutional layer 3 | 1 × 3 | 134 × 256 | ReLU | 196,864
Pooled convolutional layer 4 | 2 × 3 | 136 × 80 | ReLU | 122,960
Full connected layer 1 | - | 1 × 256 | ReLU | 2,785,536
Full connected layer 2 | - | 1 × 11 | Softmax | 2827
Output layer | - | 1 × 11 | - | 0
Table 12. Recognition accuracy and number of parameters of five models.
Model | ResNet | Inception | CLDNN | CNN | Model (This Paper)
Highest recognition accuracy | 87.75% | 93.60% | 92.82% | 92.34% | 98.47%
Average recognition accuracy | 60.40% | 62.43% | 65.40% | 64.29% | 68.25%
Number of parameters | 3,425,233 | 10,142,983 | 164,233 | 278,299 | 3,306,075
Table 13. Confusion matrix.
Confusion Matrix | Predicted: 1 | Predicted: 0 | Total
Actual: 1 | TP | FN | TP + FN: Actual Positive
Actual: 0 | FP | TN | FP + TN: Actual Negative
Total | TP + FP: Predicted Positive | FN + TN: Predicted Negative | TP + TN + FP + FN: The total number of samples
Table 14. Accuracy comparison of existing classical methods.
Network Structure | Dataset | Recognition Rate
CLDNN [67] | RadioML2016.10a | 88.5%
CM + CNN [32] | RadioML2016.10a | 90%
LSTM [57] | RadioML2016.10a | 90%
RNN [23] | RadioML2016.10a | 91%
Semi-supervised adversarial training approach [65] | RadioML2016.10a | 91%
DHN [66] | RadioML2016.10a | 93%
ConvLSTMAE [110] | RadioML2016.10a | 94.51%
Table 15. Advantages and disadvantages of the DL network models.
DL Network Model | Advantages | Disadvantages
CNN | Local perception; Weight sharing; Shift invariance. | The input must be of fixed length; One-way connection without feedback.
RNN | Contains the feedback input at the current time; Processes signal sequences. | Gradient vanishing problem; Cannot solve the long-term dependency problem.
LSTM | Back propagation; Memory function. | The calculation is complex and time-consuming.
DBN | Establishes a joint probability distribution; Unsupervised learning. | High complexity; Low recognition accuracy.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
