Time–Frequency-Analysis-Based Blind Modulation Classification for Multiple-Antenna Systems

Blind modulation classification is an important step in implementing cognitive radio networks. The multiple-input multiple-output (MIMO) technique is widely used in military and civil communication systems. Due to the lack of prior information about channel parameters and the overlapping of signals in MIMO systems, the traditional likelihood-based and feature-based approaches cannot be applied in these scenarios directly. Hence, in this paper, to resolve the problem of blind modulation classification in MIMO systems, the time–frequency analysis method based on the windowed short-time Fourier transform is used to analyze the time–frequency characteristics of time-domain modulated signals. Then, the extracted time–frequency characteristics are converted into red–green–blue (RGB) spectrogram images, and a convolutional neural network based on transfer learning is applied to classify the modulation types according to the RGB spectrogram images. Finally, a decision fusion module is used to fuse the classification results of all the receiving antennas. Through simulations, we analyze the classification performance at different signal-to-noise ratios (SNRs); the results indicate that, for the single-input single-output (SISO) network, our proposed scheme can achieve 92.37% and 99.12% average classification accuracy at SNRs of −4 and 10 dB, respectively. For the MIMO network, our scheme achieves 80.42% and 87.92% average classification accuracy at −4 and 10 dB, respectively. The proposed method greatly improves the accuracy of modulation classification in MIMO networks.


Introduction
The increase in communication demands and the shortage of spectrum resources have caused the cognitive radio (CR) and multiple-input multiple-output (MIMO) techniques to be implemented in wireless communication systems. As one of the essential steps of CR, modulation classification (MC) is widely applied in both civil and military applications, such as spectrum surveillance, electronic surveillance, electronic warfare, and network control and management [1]. It improves radio spectrum utilisation and enables intelligent decision-making for context-aware autonomous wireless spectrum monitoring systems [2]. However, most of the existing MC methods focus on single-input single-output (SISO) scenarios and cannot be directly applied when the transceivers are equipped with multiple transmit antennas [3]. Therefore, it is crucial to study the performance of MC methods for MIMO communication systems.
Traditional MC approaches for the SISO systems discussed in the literature can be classified into two main categories: likelihood-based (LB) approaches and feature-based (FB) approaches [4]. The LB approaches can theoretically achieve optimal performance, as they compute the likelihood functions of the different modulated signals to maximise the classification accuracy. However, they have a very high computational complexity and require prior information, such as the channel coefficient [5] [6]. Hence, the LB approaches cannot be directly applied in fast modulation classification and blind modulation classification (BMC). By contrast, the FB approaches cannot obtain the optimal result, but they have lower computational complexity and do not require prior information. The FB methods usually include two steps: feature extraction and classifier design. The higher-order statistics, instantaneous statistics, and other features are calculated in the feature extraction step. Then popular classification methods, such as the decision tree [7], support vector machine [8] [9], and artificial neural network (ANN) [10] [11], are adopted as the classifiers.
With the rapid rise of artificial intelligence and the emerging requirements of intelligent wireless communication, deep learning-based approaches are now widely studied and used in different aspects of wireless communication, such as the transceiver design at the physical layer [12] and BMC problems [13] [14] [15] [16] [17] [18]. As for BMC in SISO scenarios, the raw in-phase and quadrature phase (IQ) data or the time-domain amplitude and phase data can be directly used as the input of the deep learning neural network. More specifically, the authors in [13] presented convolutional long short-term deep neural network and deep residual network (ResNet) algorithms to identify 10 different modulation types, with a high classification accuracy over a wide range of signal-to-noise ratio (SNR) values. Rajendran et al. [14] proposed a new data-driven model for BMC based on long short-term memory (LSTM), which learnt the features from the time-domain amplitude and phase information of the modulation schemes and yielded an average classification accuracy close to 90% for SNRs from 0 dB to 20 dB. Zhang et al. [15], adopting the ResNet model as the classifier, presented an approach to fuse the time-frequency images and the handcrafted features of the modulated signals to obtain more discriminating features. The experimental results showed that the proposed scheme has a superior performance. The latest research indicates that the deep learning-based MC methods achieve better overall performance than the traditional LB and FB approaches for the SISO systems.
Although the MC (or BMC) methods for SISO networks are becoming more mature, research into MC for MIMO networks has just begun [3]. The authors in [19] and [20] proposed similar methods for the MC of MIMO transceiver systems, which calculate the higher-order statistical moments and cumulants of the received signal. Then an artificial neural network is employed to classify the modulation types. In [21], a clustering classifier based on centroid reconstruction is presented to identify the modulation scheme with unknown channel matrix and noise variance in MIMO systems. The simulation results showed that their algorithm can obtain excellent performance, even at low SNR and with a very short observation interval. To deal with the BMC problem and the two major constraints in the railway transmission environment (i.e. the high speeds and the impulsive nature of the noise), Kharbech et al. [22] proposed a feature-based process of blind identification that includes three parts: impulsive noise mitigation, feature extraction, and classification. By analysing the correlation functions of the received signals for certain modulation formats, Mohamed et al. resolved the BMC problem in single- and multiple-antenna systems operating over frequency-selective channels in [23], and the BMC problem in Alamouti STBC systems in [24].
For MIMO systems, it is difficult to directly apply deep learning to the raw IQ data or the time-domain amplitude and phase data, since the overlapped signals at the receiver of the MIMO system destroy the statistical features [25]. Hence, it is crucial to extract the distinguishable features or convert the raw signals for BMC in MIMO systems. Time-frequency analysis methods can jointly analyse the time-domain and frequency-domain features of signals, and different modulation types have distinct time-domain and frequency-domain features. Hence, in this paper, in order to overcome the effect of the overlapped signals at the receiver, we analyse the time-frequency features of the modulated signals to resolve the BMC problem in MIMO systems. First, the time-frequency analysis method based on the windowed short-time Fourier transform (STFT) [26] is employed to generate the spectrum of the MIMO-modulated signals. The spectrum in different time windows is then converted to a greyscale image, and the greyscale image is converted to a red-green-blue (RGB) spectrogram image [27]. Second, a fine-tuned AlexNet-based convolutional neural network (CNN) model is introduced to learn the features from the RGB spectrogram images. The modulation scheme of each receiving stream among the receiving MIMO signals is identified in this stage. Finally, the previously produced decisions are merged to form the final result. In addition, this method can be simplified to apply directly to SISO systems. The simulation results show that the proposed method achieves a superior performance in low-SNR scenarios for both MIMO and SISO systems.
This paper is organised as follows. The signal model of the MIMO and SISO systems and the STFT-based time-frequency analysis method are introduced in Section 2. Section 3 presents the BMC scheme for MIMO systems, including the proposed CNN model and the decision method. Then the RGB spectrogram image and the classification performance in different scenarios are analysed in Section 4. Finally, conclusions are drawn in Section 5.

Signal model and time-frequency analysis method
In this section, we define the MIMO signal model, and then the simplified SISO signal model is derived. Then the STFT-based time-frequency analysis method is introduced to generate the spectrogram image of the MIMO-modulated signals.

MIMO signal model
We consider a MIMO-based single-carrier wireless communication system with $N_t$ transmit antennas and $N_r$ receive antennas. The flat-fading and time-invariant MIMO channel is adopted herein. Therefore, the MIMO channel $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ is defined as

$$\mathbf{H} = \begin{bmatrix} h_{11} & \cdots & h_{1 N_t} \\ \vdots & \ddots & \vdots \\ h_{N_r 1} & \cdots & h_{N_r N_t} \end{bmatrix}, \tag{1}$$

where $h_{ij}$ represents the channel coefficient between the $j$-th transmit antenna and the $i$-th receive antenna. The channel matrix $\mathbf{H}$ is assumed to have full column rank, and the channel gains remain constant over the observation interval. Let $\mathbf{x} = [x_1(t), \ldots, x_j(t), \ldots, x_{N_t}(t)]^T$ denote the transmitted data streams, where $x_j(t)$ represents the transmitted modulated signal at the $j$-th transmit antenna. Likewise, let $\mathbf{y} = [y_1(t), \ldots, y_i(t), \ldots, y_{N_r}(t)]^T$ represent the received data streams, where $y_i(t)$ is the received signal at the $i$-th receive antenna. Then the received signals can be further described by

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}, \tag{2}$$

where the vector $\mathbf{n} = [n_1(t), \ldots, n_i(t), \ldots, n_{N_r}(t)]^T$ represents the additive white Gaussian noise (AWGN) vector, and each element $n_i(t)$ of $\mathbf{n}$ is an independent and identically distributed (i.i.d.) random variable with zero mean and variance $\sigma^2$ (i.e. $n_i(t) \sim \mathcal{N}(0, \sigma^2)$). In order to obtain the RGB spectrogram image of $y_i(t)$, the datasets generated in this paper are time-domain signals [15], instead of the baseband signals used in [28,29].
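The received-signal model can be illustrated with a minimal NumPy sketch. The function name `mimo_receive`, the real-valued channel gains drawn uniformly from [0, 1], and the way the noise variance is derived from the average received power are assumptions made for illustration; they are not taken from the simulation code used in this paper.

```python
import numpy as np

def mimo_receive(x, n_r, snr_db, rng):
    """Return (y, H) with y = H x + n for a flat-fading, time-invariant channel.

    x has shape (n_t, n_samples): one real time-domain stream per transmit antenna.
    """
    n_t, _ = x.shape
    # Assumption: real channel gains drawn uniformly from [0, 1].
    h = rng.uniform(0.0, 1.0, size=(n_r, n_t))
    clean = h @ x
    # Assumption: a single noise variance sigma^2, set from the average
    # received signal power so that the overall SNR equals snr_db.
    sigma2 = np.mean(clean ** 2) / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(sigma2), size=clean.shape)
    return clean + noise, h

# Two transmit streams on a 2 kHz carrier, sampled at 16 kHz, four receive antennas.
rng = np.random.default_rng(0)
t = np.arange(2240) / 16000.0
x = np.vstack([np.cos(2 * np.pi * 2000.0 * t), np.sin(2 * np.pi * 2000.0 * t)])
y, h = mimo_receive(x, n_r=4, snr_db=10.0, rng=rng)
print(y.shape)  # (4, 2240)
```

Each row of `y` is one received stream $y_i(t)$, a noisy weighted sum of the transmitted streams.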

SISO signal model
When $N_r = N_t = 1$, the MIMO-based signal model in Section 2.1 can be converted into a SISO-based signal model. The received signals corrupted by the AWGN in the SISO system can then be represented as

$$y(t) = h\,x(t) + n(t), \tag{3}$$

where $x(t)$ represents the original digital modulated signals, $y(t)$ represents the digital modulated signals over the wireless channel, $h$ represents the channel attenuation coefficient, and $n(t)$ denotes the AWGN. In this paper, the original digital modulated signals $x(t)$ may be multiple amplitude-shift keying (MASK), multiple frequency-shift keying (MFSK), multiple phase-shift keying (MPSK), and quadrature amplitude modulation (QAM) signals [30].
In the time-domain expression (4) of the MASK-modulated signals, $A_m$, $T_s$, $f_c$, and $\phi_0$ represent the modulation amplitude, symbol period, carrier frequency, and initial phase, respectively. The value of $A_m$ depends on the symbol sequence and the modulation order $M$. In addition, $g(t)$ is a baseband signal waveform and is usually a square-root raised cosine pulse.
Similarly, in the time-domain expressions (5) and (6) of the MFSK and MPSK signals, $f_m$ and $\phi_m$ are the modulation frequency and phase, respectively, and the values of these parameters depend on the symbol sequence and the modulation order $M$.
However, the QAM signal is slightly different from the MXSK (MASK, MFSK, and MPSK) modulated signals, because the QAM-modulated signal has two orthogonal carriers. In its time-domain expression (7), the two carriers are individually modulated by $a_n$ and $b_n$, whose admissible values depend on $\sqrt{M}$ [9].
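To make the roles of $A_m$, $f_m$, and $\phi_m$ concrete, the following sketch generates MASK, MFSK, and MPSK waveforms from a symbol sequence. It is illustrative only: the function `modulate`, the rectangular pulse used in place of a square-root raised cosine pulse, the amplitude levels, and the FSK tone spacing are all assumptions. The sampling frequency, carrier frequency, and symbol rate are the values used in the simulations.

```python
import numpy as np

FS = 16_000.0  # sampling frequency in Hz
FC = 2_000.0   # carrier frequency in Hz
FB = 100.0     # symbol rate in Hz

def modulate(symbols, scheme, order):
    """Return a real passband waveform for scheme 'ask', 'fsk' or 'psk'."""
    sps = int(FS / FB)                        # samples per symbol (160)
    m = np.repeat(np.asarray(symbols), sps)   # rectangular pulse g(t): an assumption
    t = np.arange(m.size) / FS
    if scheme == "ask":
        # Amplitude A_m takes one of `order` evenly spaced levels in [0, 1].
        return (m / (order - 1)) * np.cos(2 * np.pi * FC * t)
    if scheme == "fsk":
        # Assumed tone spacing of 4 * FB, centred on the carrier (hypothetical value).
        f_m = FC + (m - (order - 1) / 2) * 4 * FB
        return np.cos(2 * np.pi * f_m * t)
    if scheme == "psk":
        # Phase phi_m takes one of `order` evenly spaced values in [0, 2*pi).
        return np.cos(2 * np.pi * FC * t + 2 * np.pi * m / order)
    raise ValueError(f"unknown scheme: {scheme}")

bits = [0, 1] * 7                 # 14 symbols
s_ask = modulate(bits, "ask", 2)
print(s_ask.size)  # 2240
```

With 14 symbols and 160 samples per symbol, each waveform contains 2240 samples.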

STFT-based time-frequency analysis
In this paper, the STFT is adopted in the modulated signal analysis. That is, we use the STFT to analyse the frequency and phase of local sections of the time-varying modulated signals with a time window function [31]. Then the spectrogram image (the visual representation of the frequency spectrum of a signal) is constructed. In this subsection, we introduce the theory of the STFT, and then we present the method to generate the STFT-based RGB spectrogram image for the modulated signals.

Theory of the STFT
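Consider a signal $s(\tau)$ and a real, even window $w(\tau)$, whose Fourier transforms (FTs) are $S(f)$ and $W(f)$, respectively. To obtain a localised spectrum of $s(\tau)$ at time $\tau = t$, the signal is multiplied by the window $w(\tau)$ centred at time $\tau = t$, which results in

$$s_w(t, \tau) = s(\tau)\, w(\tau - t). \tag{8}$$

Next, the FT is taken with respect to $\tau$, obtaining

$$F_s^w(t, f) = \int_{-\infty}^{\infty} s(\tau)\, w(\tau - t)\, e^{-j 2\pi f \tau}\, d\tau, \tag{9}$$

where $F_s^w(t, f)$ is the STFT [26].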

Generating the STFT-based RGB spectrogram image for the modulated signals
In order to perform the STFT and obtain the spectrogram image of the modulated signals, we implement the process shown in Fig. 1. Dividing a given discrete modulated signal vector $y(n)$ of length $L$ into highly overlapped frames, each of length $w_s$, generates the spectral vector $f$, where $y(n)$ is obtained by sampling the received modulated signal $y(t)$. Hence, the signal in the current frame, $y_F(n)$, is

$$y_F(n) = y\big(n + (F - 1)\delta\big)\, w(n), \quad n = 0, 1, \ldots, w_s - 1, \tag{10}$$

where $F$ is the current frame ($F = 1, \ldots, N_F$) and $w(n)$ is the window function. The window function can be Hamming, Hanning, or Blackman; we choose the Hamming window in this paper [32]. The increment $\delta$ between two consecutive frames is calculated by

$$\delta = w_s - L_{overlap}. \tag{11}$$

Herein, $L_{overlap}$ ($L_{overlap} < w_s < L$) is the length of the overlapped signal between two consecutive frames, and the number of frames $N_F$ can be calculated by

$$N_F = \left\lfloor \frac{L - L_{overlap}}{\delta} \right\rfloor. \tag{12}$$

The larger the $L_{overlap}$, the greater the $N_F$, and hence the higher the time resolution of the STFT. The Hamming window function $w(n)$ is defined as

$$w(n) = \left[ 0.54 - 0.46 \cos\left( \frac{2\pi n}{w_s - 1} \right) \right] R_{w_s}(n), \tag{13}$$

where $R_{w_s}(n)$ is a rectangular window with length $w_s$.
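As a quick check of the framing arithmetic, the sketch below computes the frame increment and the number of frames for the parameter values used in the simulations ($L = 2240$, $w_s = 320$, $L_{overlap} = 315$), together with the Hamming window coefficients. The helper names are hypothetical, and the frame count assumes that the signal is not padded, so that only complete frames are counted.

```python
import math

def frame_layout(length, w_s, l_overlap):
    """Return (delta, n_frames) for frames of length w_s that overlap by l_overlap."""
    if not (l_overlap < w_s < length):
        raise ValueError("expected l_overlap < w_s < length")
    delta = w_s - l_overlap
    # Assumption: no padding, so only frames that fit entirely are counted.
    n_frames = (length - l_overlap) // delta
    return delta, n_frames

def hamming(w_s):
    """Hamming window coefficients w(n) for n = 0, ..., w_s - 1."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (w_s - 1)) for n in range(w_s)]

delta, n_frames = frame_layout(2240, 320, 315)
print(delta, n_frames)  # 5 385
```

A larger overlap gives a smaller increment and therefore more frames along the time axis of the spectrogram.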
Based on (10), the spectral magnitude vector $f_F$ of the current frame $F$ is obtained from the Fourier transform of $y_F(n)$, where $N/2 - 1$ is the number of points of the Fourier transform. The larger the $N$, the higher the frequency resolution of the STFT. The linear value of the spectral magnitude vector is then normalised to the range [0, 1], giving the normalised linear spectral magnitude vector $G(k, F)$. By combining $G(k, F)$ of all the frames, we obtain the time-frequency matrix $G \in \mathbb{C}^{(N/2-1) \times N_F}$. This matrix is a greyscale image of the spectral magnitude vector. The size of this image is $(N/2 - 1) \times N_F$; the horizontal axis represents time and the vertical axis represents frequency.
Next, the greyscale image is quantised into its RGB components using the jet colour map in MATLAB R2016b [33]. The mapping is expressed as

$$I_c = f_{map}(G),$$

where $I_c$ is the RGB spectrogram image and $f_{map}$ is the nonlinear jet quantisation function [27]. It is worth noting that the colour mapping is deployed in this paper to facilitate the observation and analysis of the RGB spectrogram image; this step can be omitted in practical applications. For the STFT, by adjusting the values of the window length $w_s$ and the overlapped signal length $L_{overlap}$, we can tune the time resolution of the RGB spectrogram image. Moreover, by adjusting the number of points of the Fourier transform, $N/2 - 1$, we can also tune the frequency resolution of the RGB spectrogram image.
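The steps from the sampled signal to the greyscale time-frequency matrix can be sketched in NumPy as follows. This is a minimal illustration rather than the implementation used in this paper: the function name `stft_greyscale`, the choice of retaining bins 1 to N/2 − 1 of the zero-padded FFT, and the min-max normalisation are assumptions, and the colour mapping step is left out.

```python
import numpy as np

def stft_greyscale(y, w_s=320, l_overlap=315, n_fft=2048):
    """Return a normalised (n_fft/2 - 1) x N_F magnitude matrix with values in [0, 1]."""
    y = np.asarray(y, dtype=float)
    delta = w_s - l_overlap
    n_frames = (y.size - l_overlap) // delta
    window = np.hamming(w_s)
    # Row F of `frames` holds the windowed samples of the frame starting at F * delta.
    starts = delta * np.arange(n_frames)
    frames = y[starts[:, None] + np.arange(w_s)[None, :]] * window
    # Zero-padded n_fft-point FFT of each frame; assumption: keep bins 1 .. n_fft/2 - 1.
    magnitude = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))[:, 1:n_fft // 2]
    g = magnitude.T  # rows: frequency, columns: time
    # Assumption: min-max normalisation to the range [0, 1].
    return (g - g.min()) / (g.max() - g.min())

t = np.arange(2240) / 16000.0
g = stft_greyscale(np.cos(2 * np.pi * 2000.0 * t))
print(g.shape)  # (1023, 385)
```

For an unmodulated 2 kHz carrier sampled at 16 kHz, the energy concentrates in a single horizontal band near row 255 of the matrix.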

Proposed BMC scheme
In this section, a time-frequency analysis is conducted and a deep learning-based BMC scheme is proposed. The block diagram of the proposed BMC scheme is shown in Fig. 2, which comprises four modules: signal generator, time-frequency analysis, CNN classifier, and decision fusion. The signal generator outputs the modulated signals $x_i(t)$ (with the same modulation type) for each transmit antenna [19]. This process was described in subsections 2.1 and 2.2. Then the time-frequency analysis is performed on the received signal $y_i(t)$ of each receive antenna, which generates the RGB spectrogram image $I_{ci}$.

Time-frequency analysis for received signals
The flow chart of the STFT-based time-frequency analysis is shown in Fig. 1. First, using the ASK signal as an example, the received signal $y(t)$ is divided into $N_F$ frames by the Hamming window $w(n)$ with length $w_s$, the details of which are described in Eqs. (10)-(13). Second, the spectrum of the windowed signal is obtained by its Fourier transform. Third, by normalising and combining the linear spectral magnitude vectors, the greyscale spectrogram image $G$ is obtained (the size of the related greyscale matrix is $(N/2 - 1) \times N_F$). Finally, to accommodate the input layer of AlexNet and improve the distinguishability of the spectrogram image, the greyscale spectrogram image is mapped into the RGB spectrogram image $I_c$ (the size of the related RGB matrix is $(N/2 - 1) \times N_F \times 3$). Then, the RGB matrix is cut or padded to $227 \times 227 \times 3$ before it is fed into the CNN.
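The final cut-or-pad step can be sketched as below. Which region of the image is kept when cutting, and how the image is padded, are not specified here; the centred crop and zero padding in the hypothetical helper `fit_to_input` are therefore assumptions.

```python
import numpy as np

def fit_to_input(image, size=227):
    """Centre-crop or zero-pad an H x W x C array to size x size x C."""
    h, w = image.shape[:2]
    out = np.zeros((size, size, image.shape[2]), dtype=image.dtype)
    # Offsets of the copied region in the source image and in the output array.
    src_r, dst_r = max((h - size) // 2, 0), max((size - h) // 2, 0)
    src_c, dst_c = max((w - size) // 2, 0), max((size - w) // 2, 0)
    rows, cols = min(h, size), min(w, size)
    out[dst_r:dst_r + rows, dst_c:dst_c + cols] = image[src_r:src_r + rows,
                                                        src_c:src_c + cols]
    return out

rgb = np.ones((1023, 385, 3))      # same size as an RGB spectrogram matrix
print(fit_to_input(rgb).shape)     # (227, 227, 3)
```

An axis that is longer than 227 is cropped and an axis that is shorter is padded, so the output shape is fixed for any input size.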

AlexNet based CNN classifier
In our proposed BMC scheme, AlexNet, which is utilised for object detection [35] and was the winner of the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), is adopted as the classifier. The network architecture of AlexNet is shown in Fig. 3 [34].
As depicted in Fig. 3, AlexNet contains eight layers; the first five are convolutional and the remaining three are fully connected. The output of the last fully connected layer is fed to a 1000-way softmax that produces a distribution over the 1000 class labels [35]. AlexNet uses the rectified linear unit (ReLU) as the activation function of the CNN. In practice, the dropout and max pooling techniques are applied to the CNN. AlexNet has an excellent performance in visual tracking and object detection due to its capability in sensing the pattern position on the image. Therefore, considering that the spectrogram image has rich pattern position information, it is sensible to choose AlexNet as the classifier network.
The motivation of transfer learning comes from the fact that people can intelligently apply knowledge learned previously to solve new problems faster or with better solutions [36].In order to utilise the pretrained AlexNet, transfer learning is employed to fine-tune AlexNet and accelerate the training process.The last layer of the pretrained AlexNet network in Fig. 3 is configured with 1000 classes, and this layer must be fine-tuned to accommodate the new classification task.First, all layers except the last layer are extracted, then the last layer is replaced with a new fully connected layer that contains eight neurons (i.e. the number of modulation categories in this paper).In the end, the parameters of the activation layer and the classification output layer are set to accommodate the new classification task.Therefore, with such fine-tuning, the output of AlexNet can precisely perform the modulation classification of the received signals.

Decision fusion
Since there are multiple antennas at the receiver of the MIMO network, the branches can cooperate with each other to achieve higher identification reliability [19]. As shown in Fig. 2, the $N_r$ received signals are classified independently, and the influences of signal overlapping, interchannel noise, and random phase shifting may cause the received signals to be identified as different modulation types. This may lead to incorrect identification results. The decision fusion among all the receive antennas aims to improve the average classification accuracy. The decision vector of the $i$-th received signal, $d_i$, can be defined as

$$d_i = [d_{i1}, \ldots, d_{ik}, \ldots, d_{iK}],$$

where $K$ is the number of modulation types, $d_{ik}$ is the probability of identifying the received signal $y_i(t)$ as modulation type $k$, and $d_{ik}$ meets the following condition:

$$\sum_{k=1}^{K} d_{ik} = 1.$$

Therefore, the modulation type $m_i$ of the received signal $y_i(t)$ is the modulation type that has the maximum probability. The modulation types with maximum probability can be defined as a set $M$ as follows:

$$M = \Big\{ k : d_{ik} = \max_{1 \le l \le K} d_{il} \Big\}.$$

Note that there are two cases for the above equations: 1) the maximum probability is unique, i.e. $|M| = 1$, and the modulation type of the $i$-th received signal is the element of $M$; 2) the maximum probability is not unique, i.e. $|M| \ge 2$, and the modulation type of the $i$-th received signal is randomly chosen from $M$.
Hence, the decision fusion can be converted to the problem of deciding the final modulation type $m$ according to $m_i$, $i = 1, 2, \ldots, N_r$. The fusion rule at the fusion module can be the OR, AND, or majority rule, which can be generalised as the "n-out-of-$N_r$ rule" [37]. That is, a certain modulation scheme is identified when $n$ of the $N_r$ classifiers decide on it. Take $N_r = 4$ as an example, with the possible modulation types forming the set $M = \{$2PSK, 4PSK, 8PSK$\}$. If three or more classifiers identify the modulation type as 2PSK (4PSK or 8PSK), then the final modulation type is 2PSK (4PSK or 8PSK). If two classifiers identify the modulation type as 2PSK and the other two classifiers identify the modulation type as 4PSK and 8PSK, respectively, then the final decision is 2PSK. In addition, if two classifiers identify the modulation type as 2PSK and the other two classifiers identify the modulation type as 4PSK (or 8PSK), the decision fusion centre randomly chooses a modulation type between 2PSK and 4PSK (or 8PSK) as the final result.
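The per-antenna decision and the fusion rule in the example above can be sketched in plain Python. The function names are hypothetical, and the fusion is written as a plurality vote with random tie-breaking, which reproduces the three cases of the example; how a four-way disagreement is handled is an assumption (it is treated as a tie between all candidates).

```python
import random
from collections import Counter

def classify_stream(probabilities, rng=random):
    """Return the index of the most probable modulation type; ties are broken at random."""
    best = max(probabilities)
    candidates = [k for k, p in enumerate(probabilities) if p == best]
    return rng.choice(candidates)

def fuse(decisions, rng=random):
    """Plurality vote over the per-antenna decisions; ties are broken at random."""
    counts = Counter(decisions)
    top = max(counts.values())
    winners = sorted(label for label, count in counts.items() if count == top)
    return rng.choice(winners)

print(fuse(["2PSK", "2PSK", "2PSK", "4PSK"]))  # 2PSK
print(fuse(["2PSK", "2PSK", "4PSK", "8PSK"]))  # 2PSK
# A 2-2 split returns either 2PSK or 4PSK, chosen at random.
print(fuse(["2PSK", "2PSK", "4PSK", "4PSK"], random.Random(0)))
```

Passing a seeded `random.Random` instance makes the tie-breaking reproducible in experiments.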

Performance analysis
In this section, the proposed time-frequency analysis and deep learning-based BMC algorithm is tested under different modulation schemes in both the SISO and MIMO scenarios. Specifically, the channel attenuation is drawn at random from [0, 1], and random phase shifts within one symbol interval are considered for the MIMO scenario. AWGN with different SNRs is added to the modulated signals for both the SISO and MIMO scenarios. In addition, we consider the following MIMO antenna configuration: $N_t = 2$ and $N_r = 4$. In the simulations, the 2ASK, 2FSK, 2PSK, 4ASK, 4FSK, 4PSK, 8PSK, and 16QAM modulation schemes are considered, unless otherwise stated. The parameters of the modulated signals are assigned as follows. The sampling frequency $f_s$ is 16 kHz, the carrier frequency $f_c$ is 2 kHz, the symbol rate $f_b$ is 100 Hz, and the length of the original digital signal is 14 (i.e.

RGB spectrogram image of the modulated signals
In this subsection, in order to simplify the analysis, we select only certain binary and quaternary digital signal sequences (as shown in Fig. 4) to generate the RGB spectrogram image. The binary signal in Fig. 4(a) is used to generate the second-order modulated signals (i.e. 2ASK, 2FSK, and 2PSK), and the quaternary signal in Fig. 4(b) is used for the fourth-order modulated signals (i.e. 4ASK, 4FSK, and 4PSK). First of all, the RGB spectrogram image is a time-frequency distribution image of the modulated signal. The horizontal axis of this image represents time and the vertical axis represents frequency. In addition, the colour of the RGB spectrogram image represents the value of the normalised spectral magnitude (i.e. the values corresponding to blue and red are zero and one, respectively).
Figs. 5(a) and 5(d) show the RGB spectrogram images of the ASK-modulated signals. The power of the ASK-modulated signals is concentrated in one frequency band in the image, and the power in the image is discontinuous over time. In addition, the colour in the image is blue when the digital signal sequence in Fig. 4 is at zero level, and it is red when the digital signal sequence is at a non-zero level, which corresponds to the values of the spectral magnitude. Moreover, compared with the 2ASK signal, the spectral magnitude of the 4ASK signal has a larger average value (i.e. more pixels in the 4ASK RGB spectrogram image have a value of 1).
The antenna configuration for the MIMO system is $N_t = 2$ and $N_r = 4$. The channel attenuation is drawn at random from [0, 1], random phase shifts within one symbol interval are considered for the MIMO scenario, and AWGN with an SNR of 10 dB is added to the modulated signals. In addition, a multiplexing-based transmission scheme is adopted for the MIMO system. Specifically, two transmit antennas send two independent data streams, but with the same modulation scheme (e.g. 2ASK, 2FSK, or 2PSK). The result is shown in Fig. 7.
A comparison of Figs. 7 and 5 shows that, for all the modulated signals, the signal overlapping of the MIMO system has no effect on the power distribution of the modulated signals in the frequency domain, but the power distribution over the time domain is changed. The latter can be explained by the fact that the overlapping of different transmitted signals partly destroys the time-frequency characteristics of the raw modulated signals. In spite of this, some crucial time-frequency characteristics are not destroyed by the MIMO signal overlapping, such as the 'ring' that is caused by the phase mutation in the 2PSK signal (shown in Figs. 5(c) and 7(c)). Hence, the overlapping of modulated

Classification accuracy in the MIMO scenario
The classification performance of the proposed scheme in the MIMO scenario is now verified. In order to better understand the performance of the proposed scheme, the model is trained and tested with two data sets (as in [19]): one for the modulation set $\Theta_1$ = {2ASK, 2FSK, 2PSK, 4ASK, 4FSK, 4PSK, 8PSK, 16QAM} and another for a smaller modulation set $\Theta_2$ = {2ASK, 2FSK, 2PSK, 4ASK, 4FSK, 4PSK}. In the testing stage, the SNR of the modulated signals is varied from SNR = −4 dB to SNR = 10 dB, and the result is shown in Fig. 10. Both with and without the decision fusion module, the classification accuracy of the proposed scheme increases as the SNR of the modulated signals increases, which is consistent with the theoretical analysis. However, by introducing the decision fusion module, a 10% performance improvement in the classification accuracy can be achieved. In more detail, the proposed scheme can achieve 80.42% and 87.92% accuracy at −4 and 10 dB SNR in $\Theta_1$, and 87.78% and 93.33% accuracy at −4 and 10 dB SNR in $\Theta_2$. In addition, the average classification accuracy for the MIMO scenario is lower than that for the SISO scenario. This is because, with multiple antennas in the system, the structure of the original signals is destroyed by overlapping at the receive antenna, as mentioned in Section 4.1.
SNRs of −4 dB and 10 dB, respectively. The MFSK- and QAM-modulated signals have the highest classification accuracies at both −4 dB and 10 dB, and the MASK-modulated signals have the second highest. The MPSK signals (especially the 4PSK signals) exhibit the worst classification performance, as shown in Fig. 11(a). Most of the 4PSK signals are misclassified as 8PSK at SNR = −4 dB, and the performance is improved only slightly at SNR = 10 dB. This result indicates that the MIMO system structure has negative effects on the time-frequency characteristics of the MPSK signals, which is consistent with the theoretical analysis. Hence, our proposed scheme has difficulty identifying the high-order PSK signals in the MIMO system. However, the time-frequency analysis and deep learning-based scheme has excellent performance in classifying the MFSK-, ASK-, and QAM-modulated signals, and it can obtain superior average classification accuracy for the MIMO system.

Conclusion
In this paper, we resolve the problem of blind modulation classification for the MIMO system. Specifically, the windowed STFT was used to analyse the time-frequency characteristics of the modulated signals, and the time-frequency graphs of the modulated signals were converted to RGB spectrogram images. Then transfer learning was utilised to fine-tune AlexNet to adapt to our classification problem, and the generated RGB spectrogram images were fed into the fine-tuned CNN to extract features and train the network. Finally, the decisions for the received signals from the MIMO receivers were combined by the decision fusion module to form the final decision. The STFT-based time-frequency analysis results showed that each modulation type had unique time-frequency characteristics, and that the additive noise had limited influence on the time-frequency characteristics of the modulated signals. The final classification results indicated that the proposed scheme can achieve 92.37% and 99.12% classification accuracy at SNRs of −4 dB and 10 dB in the SISO scenario. For the MIMO system, the proposed scheme still achieved 70% and 80% at −4 dB for the large and small modulation sets, respectively. In future work, we plan to improve the performance of the proposed scheme for the high-order PSK signals.

Figure 1: The flow chart of STFT-based time-frequency analysis.

Figure 2: Block diagram of the proposed MIMO modulation classification scheme.
each modulated signal contains (16000/100) × 14 = 2240 sample points). In addition, in the training stage, 100 modulated signals for each modulation type and SNR are randomly generated for both the SISO and MIMO scenarios, in which the SNR varies from −4 to 10 dB at intervals of 2 dB [15]. In the test stage, 100 modulated signals for each modulation type and SNR are randomly generated. All the signal samples are generated in MATLAB R2017b, and the training and testing of AlexNet are based on the MATLAB Neural Network Toolbox. Additionally, the parameters used to generate the RGB spectrogram image are set as $w_s = 320$, $L_{overlap} = 315$, $\delta = 5$, and $N = 2048$. We now discuss how the modulation order, SNR, and overlapping of the MIMO signals influence the RGB spectrogram image of the modulated signals. Then the classification performance of the proposed scheme is validated for different scenarios.

Figs. 5(b) and 5(e) show the RGB spectrogram images of the FSK-modulated signals at an SNR of 10 dB. The spectral magnitude of the 2FSK-modulated signals has a larger value over two subbands, and the spectral magnitude of the 4FSK-modulated signals has a larger value over four subbands. For the FSK signals, the modulation order is equal to the number of modulated frequencies, which is the number of subbands in the RGB spectrogram image. The RGB spectrogram images of the PSK-modulated signals are shown in Figs. 5(c) and 5(f). The phase mutation of the modulated signals is captured in the RGB spectrogram images. Specifically, Figs. 4(a) and 5(c) both show the π phase mutation in the 2PSK-modulated signal at the transitions from 0 to 1 and from 1 to 0 in the binary digital signal sequence. The π phase mutation decreases the value of the power spectral density at the modulated frequency, which appears as a 'ring' in the RGB spectrogram image. Similarly, comparing Figs. 4(b) and 5(f), the π/2 and 3π/2 phase mutations also partly decrease the value of the power spectral density at the modulated frequency, but they appear as a 'half-ring' in the RGB spectrogram image. Therefore, modulated signals with different modulation orders have different time-frequency features, and it is reasonable to classify the modulated signals using the time-frequency analysis.

RGB spectrogram image of the modulated signals with different SNRs
In this paper, only the second-order modulation schemes are analysed at different SNRs. For the 2ASK-modulated signals with SNR = 10 dB and SNR = −4 dB, the corresponding RGB spectrograms are shown in Figs. 5(a) and 6(a), respectively. For the 2ASK-modulated signals, as the noise power increases, the components of the noise power become more prominent, as shown by the white patches in the RGB spectrogram image. However, the main features of the RGB spectrogram image of the 2ASK-modulated signals are not destroyed. That is, the power distribution of the 2ASK-modulated signals is still concentrated in one subband in the RGB spectrogram image. In addition, the distribution of the values of the power spectral density is almost the same at different SNRs. Similarly, the RGB spectrograms for the 2FSK- and 2PSK-modulated signals with SNR = 10 dB and SNR = −4 dB are shown in Figs. 5(b) and 6(b) and Figs. 5(c) and 6(c), respectively. From these figures, we can conclude that increases in the noise power do not destroy the main features of the RGB spectrogram image of these modulated signals, and thus they can be used as the features for modulation classification even in the low-SNR region.

RGB spectrogram image of the modulated signals for the MIMO channels
We now analyse how the MIMO channel influences the RGB spectrogram image of the modulated signals. The 2ASK, 2FSK, and 2PSK modulation schemes are discussed herein.