A New Radar Signal Recognition Method Based on Optimal Classification Atom and IDCQGA

Abstract: Radar electronic reconnaissance is an important part of modern and future electronic warfare systems and is the primary means of obtaining non-cooperative intelligence information. A core task of radar electronic reconnaissance is to identify non-cooperative signals within mixed signals. However, as the battlefield electromagnetic environment grows more complex, the performance of traditional recognition systems is seriously degraded. This paper studies a new recognition method based on the optimal classification atom and an improved double chains quantum genetic algorithm (IDCQGA). The optimal classification atom is a new feature for radar signal recognition, and IDCQGA, with its symmetric coding, can serve as a global optimization algorithm. The main contributions of this paper are as follows. Firstly, to measure the difference between multi-class signals, a signal separation degree based on a distance criterion is established according to the inter-class separability and intra-class aggregation of the signals. Then, an IDCQGA is proposed to select the best atom for classification under the constraint of the distance criterion, and the inner product of the signal and the best classification atom is taken as the eigenvector. Finally, the extreme learning machine (ELM) is introduced as the classifier to complete the recognition of signals. Simulation results show that the proposed method improves the recognition rate of multi-class signals and handles overlapping eigenvector parameters better.


Introduction
Signal recognition is one of the main research directions in the field of electronic reconnaissance [1,2]. Fast and accurate identification of enemy signals provides advantages in battlefield environment perception, information control, and operational command, and thus has an important impact on the course of a war. Modern radar signals are characterized by changeable intra-pulse modulation, so studying the extraction of intra-pulse features is a reliable way to improve radar signal recognition. The intra-pulse modulation of radar signals can be divided into intentional modulation and unintentional modulation [3]. Intentional modulation is a deliberately designed modulation mode that improves detection performance and resists enemy reconnaissance. Unintentional modulation refers to the inherent parameter differences of radar equipment. This paper mainly studies the recognition of intentional modulation, which consists of feature extraction and classifier design. Feature extraction is equivalent to transforming radar signals from the high-dimensional signal space to a low-dimensional feature space, while classification or recognition is equivalent to transforming from the feature space to the decision space.
For feature extraction, delay autocorrelation and phase difference characteristics, instantaneous characteristics, statistical characteristics, and transform domain characteristics are the most widely used [4]. The delay autocorrelation and phase difference between different modulation types [...]
In order to solve these three problems, this paper proposes an optimal classification atom feature extraction and recognition method based on a distance criterion. Firstly, the concept of signal separation degree based on the distance criterion is established according to the inter-class separability and intra-class aggregation. Then, the improved double chains quantum genetic algorithm (IDCQGA) is applied to select the optimal atoms suitable for classification, and the inner product of each reconnaissance signal and the extracted atoms is used as the eigenvector. Finally, signal classification and recognition are realized by the extreme learning machine (ELM).
The main contributions of this paper are as follows: (1) Under the constraint of the distance criterion, an optimal classification atom, inspired by the traditional orthogonal matching pursuit (OMP) algorithm, is proposed to measure the difference between multi-class signals. (2) An IDCQGA is proposed to improve the extraction speed of the optimal classification atom; it is an efficient optimization algorithm that can also be applied to other fields. (3) ELM is introduced to realize radar signal recognition. To a certain extent, the problem of signal recognition in the presence of overlapping characteristic parameters is solved.
The rest of this paper is organized as follows. Section 2 introduces the model and the traditional algorithm. In Section 3, the proposed algorithm is derived. The experimental results are given in Section 4, and finally conclusions are drawn in Section 5.

Atom Decomposition Model
In signal decomposition theory, complex signals can be decomposed into linear combinations of basic signal units. Assume there is a signal $f(n)$:

$$f(n) = \sum_{k=1}^{M} c_k b_k(n)$$

where $b_1(n), b_2(n), \cdots, b_M(n)$ is a base of $f(n)$ and $c_k$ is the base coefficient. If all vectors in the $M$-dimensional space can be expressed by base $C$, then $C$ is a complete orthogonal basis. Common decomposition methods for orthogonal bases include the Fourier transform (FT), the short-time Fourier transform (STFT) and the wavelet transform (WT). Although a complete orthogonal basis gives an exact expression, its adaptive ability is poor, so it is difficult to fully represent the signal characteristics. An over-complete basis (such as time-frequency (TF) atoms) adds redundant bases so that the signal representation in the over-complete basis is sparse. Figure 1 shows the schematic diagram of signal decomposition by FT, STFT, WT and TF atoms. From Figure 1, we can see that the FT decomposition only divides the frequency components equally. STFT decomposition regards the signal as short-time stationary and adds a division of the time domain. WT decomposition enhances the expression of local information by regularly adjusting the size of the TF partition. However, this regular partitioning limits the TF expression ability for arbitrary signals. The TF atom decomposition establishes over-complete base representations of signals, which can clearly express the TF details.
An over-complete atom library can be constructed through a series of transformations of a window function $g(t) \in L^2(R)$. For example, the set of atoms formed by the displacement, modulation and expansion of the Gauss function is the commonly used Gabor atom library. The Gabor atom library can realize the sparse representation of most signals; however, because it lacks a frequency modulation (FM) parameter, it describes intra-pulse frequency agile signals poorly. In this paper, the radar signal is represented by the Chirplet atom library. Compared with the Gabor atom, the Chirplet atom adds an FM parameter to the Gauss function:

$$g_\gamma(t) = \frac{1}{\sqrt{s}}\, g\!\left(\frac{t-u}{s}\right) \exp\!\left\{ j\left[ \xi (t-u) + \frac{c}{2}(t-u)^2 \right] \right\}$$

where $\gamma = (s, u, \xi, c)$ are parameters that denote the expansion, displacement, frequency center and frequency modulation slope of the atom, respectively, and $g(t) = 2^{1/4} e^{-\pi t^2}$. The discretized over-complete atom library can be obtained by discretizing $s, u, \xi, c$ [10]. Figure 2 is a time domain diagram of four randomly selected atoms. Using Chirplet atoms can accurately capture the TF characteristics of the signal and realize signal feature extraction.
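As an illustration of the Chirplet atom described above, the following minimal sketch samples one atom on a discrete grid. It assumes the standard Gaussian Chirplet form with the window $g(t) = 2^{1/4} e^{-\pi t^2}$; the function name `chirplet_atom`, the sample-index time axis and the parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def chirplet_atom(n_samples, s, u, xi, c):
    """Sample a Chirplet atom on the grid t = 0, 1, ..., n_samples - 1.

    s: expansion (scale), u: displacement, xi: frequency center (rad/sample),
    c: frequency modulation slope (rad/sample^2). All values are illustrative.
    """
    tau = np.arange(n_samples, dtype=float) - u
    window = 2 ** 0.25 * np.exp(-np.pi * (tau / s) ** 2)   # g((t - u) / s)
    phase = xi * tau + 0.5 * c * tau ** 2                   # linear FM phase
    atom = window / np.sqrt(s) * np.exp(1j * phase)
    return atom / np.linalg.norm(atom)                      # unit energy on the grid

# A Chirplet atom and the Gabor atom obtained by setting the FM slope c to 0.
atom = chirplet_atom(n_samples=256, s=40.0, u=128.0, xi=0.8, c=0.002)
gabor = chirplet_atom(n_samples=256, s=40.0, u=128.0, xi=0.8, c=0.0)
```

Setting `c = 0` recovers a Gabor atom with the same envelope, which shows that the FM slope only changes the phase of the atom.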

OMP for Feature Extraction
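The OMP algorithm searches for the optimal matched TF atoms in the over-complete atom library to achieve feature extraction and sparse representation of the signal [11]. The signal decomposition steps of the OMP algorithm are as follows. For a signal $f$, let the TF atom library be $D = \{g_\gamma\}_{\gamma \in \Gamma}$. Choose the optimal matched atom by:

$$g_{\gamma_1} = \arg\max_{g_\gamma \in D} \left| \langle f, g_\gamma \rangle \right|$$

where $\langle \cdot, \cdot \rangle$ represents the inner product operation. Let $\mu_{\gamma_1} = g_{\gamma_1}$ ($\mu_{\gamma_1}$ is the first matched atom) and normalize it by $e_{\gamma_1} = \mu_{\gamma_1} / \|\mu_{\gamma_1}\|$ ($e_{\gamma_1}$ is the normalized first matched atom). Then, $f$ can be decomposed into the projection component $\langle f, e_{\gamma_1} \rangle e_{\gamma_1}$ and the residual $R^1 f$:

$$f = \langle f, e_{\gamma_1} \rangle e_{\gamma_1} + R^1 f$$

Then, $R^1 f$ is decomposed further, and the matched atom $g_{\gamma_k}$ is selected in the k-th decomposition:

$$g_{\gamma_k} = \arg\max_{g_\gamma \in D} \left| \langle R^{k-1} f, g_\gamma \rangle \right|$$

Gram-Schmidt orthogonalization is applied to $g_{\gamma_k}$:

$$\mu_{\gamma_k} = g_{\gamma_k} - \sum_{p=1}^{k-1} \langle g_{\gamma_k}, e_{\gamma_p} \rangle e_{\gamma_p}$$

Normalizing the k-th atom $\mu_{\gamma_k}$ gives $e_{\gamma_k} = \mu_{\gamma_k} / \|\mu_{\gamma_k}\|$. The residual signal is decomposed into:

$$R^{k-1} f = \langle R^{k-1} f, e_{\gamma_k} \rangle e_{\gamma_k} + R^k f$$

where $R^k f$ is the residual signal after the k-th step. After $K$ decomposition steps, we get the representation of the signal $f$ (with $R^0 f = f$):

$$f = \sum_{k=1}^{K} \langle R^{k-1} f, e_{\gamma_k} \rangle e_{\gamma_k} + R^K f$$

In the iterative matching process, the first matched atom represents the principal component of the signal. As $K$ increases, the residual component $R^K f$ approaches 0 [11]. Thus, the sparse representation of the radar signal can be obtained by adding all the signal parts of the iterative matching.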
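The OMP steps described above (matching, Gram-Schmidt orthogonalization, projection, residual update) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the dictionary is a random matrix with unit-norm columns rather than a Chirplet library, and the names `omp`, `D` and `K` are chosen here for illustration.

```python
import numpy as np

def omp(f, D, K):
    """Run K steps of orthogonal matching pursuit.

    f: signal vector; D: dictionary with unit-norm atoms as columns.
    Returns selected atom indices, projection coefficients and final residual.
    """
    residual = np.asarray(f, dtype=complex)
    selected, basis, coeffs = [], [], []
    for _ in range(K):
        # Matching: pick the atom with the largest |<residual, g>|.
        scores = np.abs(D.conj().T @ residual)
        if selected:
            scores[selected] = 0.0          # do not pick the same atom twice
        k = int(np.argmax(scores))
        # Gram-Schmidt: remove components along previously chosen atoms.
        mu = D[:, k].astype(complex)
        for e in basis:
            mu = mu - np.vdot(e, mu) * e
        norm = np.linalg.norm(mu)
        if norm < 1e-12:                    # atom already in the span, stop
            break
        e_k = mu / norm
        # Projection onto the normalized atom and residual update.
        coeff = np.vdot(e_k, residual)
        residual = residual - coeff * e_k
        selected.append(k)
        basis.append(e_k)
        coeffs.append(coeff)
    return selected, np.array(coeffs), residual

# Illustrative data: a random over-complete dictionary and a 2-atom signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
f = 2.0 * D[:, 3] - 1.5 * D[:, 10]
selected, coeffs, residual = omp(f, D, K=5)
```

Because each new atom is orthogonalized against the earlier ones, the residual stays orthogonal to every selected atom, which is the property that distinguishes OMP from plain matching pursuit.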

Distance Criterion-OMP
Analyzing the feature extraction method based on OMP reveals two problems in the atom matching of traditional OMP, which were mentioned in the Introduction. Firstly, the atoms matched by OMP are the optimal matches for a single signal but may not be the optimal classification atoms for several types of signals. For example, conventional pulse (CON) and binary phase-shift keying (BPSK) signals with the same carrier frequency are very similar in the TF domain. The difference is that BPSK has a sudden change in frequency at the phase jump point. This detail is easily ignored by the OMP algorithm, so the two types of signals are misidentified. Secondly, in the TF atom matching process of the traditional OMP algorithm, both the training signals and the test signals need to be decomposed in the over-complete atom library, which greatly increases the computational complexity. Therefore, this paper proposes an optimal classification atom extraction method that distinguishes different signals by establishing the distance criterion-OMP (traditional OMP is based on the optimal matched atom). The core idea of this method is to analyze all training signals in the process of atom feature training and to use the distance criterion between classes, instead of the inner product criterion of traditional OMP, to select the classification atoms. The inner products of the selected classification atoms and the training signals then give the eigenvectors. The specific method is described as follows. Assume the signal set is $C = \{ f^{(i,j)},\ i = 1, 2, \ldots, c,\ j = 1, 2, \ldots, N_i \}$, where $c$ is the number of classes, $N_i$ is the number of signals in the i-th class, and $f^{(i,j)}$ represents the j-th sample in the i-th class. The inner product of atom $g_{\gamma_k}$ and $f^{(i,j)}$ can be expressed as

$$\Theta^{(i,j)} = \langle f^{(i,j)}, g_{\gamma_k} \rangle$$

where $\Theta^{(i,j)}$ is the eigenvector. From this formula, we can obtain the eigenvector matrix of all classes of signals for atom $g_{\gamma_k}$.
In feature space, the larger the inter-class separability (the larger the average distance between classes) and the tighter the intra-class aggregation (the smaller the average distance between samples of one class), the better the classification and recognition in that feature space. Therefore, the atom selection principle needs to consider both the intra-class and the inter-class distance; we call this principle the distance criterion.
Assume the average distance vector of the i-th class is: For c classes, the average intra-class distance can be expressed as: Expanding Equation (10), the average intra-class distance can be obtained after deduction [31]: where $u^{(i)}$ is the mean value of the i-th class and: The average inter-class distance is: where $u$ is the overall mean of the signals. Therefore, according to the distance criterion, the separation degree $D$ is defined as follows: The larger $D$ is, the higher the separation degree of the signals. Therefore, we use this distance criterion instead of the inner product matching criterion of the OMP algorithm to select atoms (distance criterion-OMP). $|D|$ is the fitness function (objective function). The number of classification atoms can be adjusted as required. The characteristic parameters of a test signal are obtained by selecting the atoms with the larger $D$ values and calculating their inner products with the test signal. At the same time, because the selected atoms may be strongly correlated, they must be orthogonalized after each extraction, which ensures the orthogonality between the extracted atoms.
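To make the distance criterion concrete, the sketch below scores a feature set by combining intra-class aggregation and inter-class separation. It assumes a Fisher-like ratio of the prior-weighted inter-class distance to the prior-weighted intra-class distance; this specific ratio and the function name `separation_degree` are assumptions made for illustration and may differ from the exact definition of $D$ used in the paper.

```python
import numpy as np

def separation_degree(features, labels):
    """Assumed separation score: inter-class distance / intra-class distance.

    features: (n_samples,) or (n_samples, n_features) array, e.g. inner products
    of the signals with one candidate atom; labels: integer class of each sample.
    """
    features = np.asarray(features, dtype=float).reshape(len(features), -1)
    labels = np.asarray(labels)
    overall_mean = features.mean(axis=0)
    intra, inter = 0.0, 0.0
    for cls in np.unique(labels):
        x = features[labels == cls]
        prior = len(x) / len(features)                 # class proportion
        class_mean = x.mean(axis=0)
        # Mean squared distance of the samples to their class mean.
        intra += prior * np.mean(np.sum((x - class_mean) ** 2, axis=1))
        # Squared distance of the class mean to the overall mean.
        inter += prior * np.sum((class_mean - overall_mean) ** 2)
    return inter / intra    # undefined if every class collapses to one point

labels = np.array([0, 0, 0, 1, 1, 1])
well_separated = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
overlapping = np.array([0.0, 2.5, 5.0, 0.1, 2.6, 5.1])
d_good = separation_degree(well_separated, labels)
d_bad = separation_degree(overlapping, labels)
```

An atom whose inner products look like `well_separated` would score far higher than one that produces `overlapping` features, so a search driven by this score prefers atoms that separate the classes rather than atoms that merely match one signal well.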
After determining the selection criterion, the remaining problem is how to quickly search the TF atom library for the classification atoms with the maximum separation degree $D$. To improve the search speed, intelligent optimization algorithms such as GA and QGA are usually used to search the TF atoms. In this paper, an IDCQGA is proposed to improve the matching speed.

IDCQGA
DCQGA uses a double-chain quantum encoding method to increase the search speed; quantum gate updating and quantum chromosome mutation then evolve the population and enrich its genes to complete the search.
Quantum encoding: In DCQGA, each quantum chromosome consists of two symmetric gene chains. Assume the population size is $m$ ($i = 1, \cdots, m$); then the i-th quantum chromosome is:

$$p_i = \begin{bmatrix} \cos(t_{i1}) & \cos(t_{i2}) & \cdots & \cos(t_{in}) \\ \sin(t_{i1}) & \sin(t_{i2}) & \cdots & \sin(t_{in}) \end{bmatrix}$$

where $t_{ij} \in (0, 2\pi)$ is the phase angle, $j = 1, \cdots, n$, and $n$ is the number of quantum genes in each chain.
The quantum probability amplitude is expressed by $[\cos(t_{ij}), \sin(t_{ij})]^T$, whose value varies in $(-1, 1)$.
In IDCQGA, we compress the encoding range by limiting $t_{ij}$ to $(0, \pi)$ and flipping $\sin(t_{ij})$ in $(\pi/2, \pi)$. The new double chain coding method is expressed as: The coding space map of the two symmetric gene chains is shown in Figure 3. This improvement ensures that the probability amplitudes $\sin(t_{ij})$ and $\cos(t_{ij})$ are still within $(-1, 1)$ and guarantees the monotonicity of the probability amplitude. Meanwhile, the new coding method increases the coding density (called high-density coding) by reducing the search space, which enables the algorithm to reach the optimal value faster.
Quantum gate updating: All evolutionary algorithms need to update the population in a specific way to improve its quality. In DCQGA, the quantum rotation gate is defined as follows:

$$R(\Delta\theta) = \begin{bmatrix} \cos\Delta\theta & -\sin\Delta\theta \\ \sin\Delta\theta & \cos\Delta\theta \end{bmatrix}$$

where $\Delta\theta$ is the rotation angle. Multiplying the original probability amplitude by the rotation gate gives the updated quantum gene:

$$R(\Delta\theta) \begin{bmatrix} \cos(t_{ij}) \\ \sin(t_{ij}) \end{bmatrix} = \begin{bmatrix} \cos(t_{ij} + \Delta\theta) \\ \sin(t_{ij} + \Delta\theta) \end{bmatrix}$$

The quantum rotation gate completes the quantum update mainly by changing the probability amplitude. Therefore, the choice of $\Delta\theta$ is particularly important: too large a $\Delta\theta$ will miss the optimum, while too small a $\Delta\theta$ will trap the updated fitness value in a local optimum [24].
In IDCQGA, we use an adaptive rotation formula based on the gradient of the fitness, as shown below: where $[\alpha_0\ \beta_0]$ and $[\alpha_i\ \beta_i]$ are the current and the to-be-updated probability amplitudes; $\Delta\theta_0$ is the initial value; $F(X_i^j)$ is the fitness function value of the i-th qubit for the j-th chromosome; $F^j_{\min}$ and $F^j_{\max}$ are the minimum and maximum fitness function values.
This adaptive updating method is more reasonable: when the gradient of the fitness function is large, $\Delta\theta$ is small, that is, the update step is reduced, which ensures that the algorithm can search for the optimal value more accurately. When the gradient of the fitness function is small, $\Delta\theta$ is large, that is, the step size is increased, which improves the search speed of the algorithm.
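The rotation gate update and the idea of an adaptive step can be sketched as follows. The rotation part follows directly from the definition of a plane rotation; the adaptive rule in `adaptive_step` is a hypothetical exponential form chosen only to reproduce the qualitative behavior described above (larger gradient, smaller step) and is not claimed to be the paper's formula. All function and parameter names are illustrative.

```python
import numpy as np

def rotation_gate(delta_theta):
    """2x2 quantum rotation gate for a rotation angle delta_theta (radians)."""
    c, s = np.cos(delta_theta), np.sin(delta_theta)
    return np.array([[c, -s], [s, c]])

def update_gene(t, delta_theta):
    """Rotate the amplitude pair [cos(t), sin(t)] by delta_theta."""
    return rotation_gate(delta_theta) @ np.array([np.cos(t), np.sin(t)])

def adaptive_step(delta_theta0, grad, grad_min, grad_max):
    """Hypothetical adaptive rule: the step shrinks as the relative gradient grows."""
    scale = (grad - grad_min) / (grad_max - grad_min)
    return delta_theta0 * np.exp(-scale)

# Rotating the gene with phase 0.3 by 0.1 rad moves its phase to 0.4 rad.
updated = update_gene(t=0.3, delta_theta=0.1)

# A steep region gets a smaller step than a flat region.
steep = adaptive_step(0.05 * np.pi, grad=9.0, grad_min=0.0, grad_max=10.0)
flat = adaptive_step(0.05 * np.pi, grad=1.0, grad_min=0.0, grad_max=10.0)
```

The rotation preserves the unit norm of the amplitude pair, so the two chains remain a valid cosine and sine pair after every update.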

Quantum chromosome mutation:
In order to further increase population diversity, optimization algorithms usually add a mutation step after the population update to avoid falling into a local optimum as far as possible. The traditional algorithm uses NOT-gate mutation, and the quantum NOT-gate is defined as follows:

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \cos(t_{ij}) \\ \sin(t_{ij}) \end{bmatrix} = \begin{bmatrix} \sin(t_{ij}) \\ \cos(t_{ij}) \end{bmatrix} \tag{21}$$

However, from Equation (21), we can see that NOT-gate mutation is realized by exchanging the probability amplitudes $\sin(t_{ij})$ and $\cos(t_{ij})$. This exchange does not increase the diversity of the probability amplitudes, that is, it has little effect on population diversity.
Therefore, in IDCQGA, we change the angle of the NOT-gate and define the $\pi/6$ mutation gate as follows: The effect of the $\pi/6$ mutation gate is: It can be seen that the $\pi/6$ mutation gate changes the value of the probability amplitude, which is a true mutation and thus enriches the diversity of the population. The value $\pi/6$ plays a mutating role and avoids falling into a local optimum as far as possible. In addition, this value can be set differently for different experiments.
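The difference between the two mutation operators can be checked numerically. In the sketch below the NOT-gate is the standard swap matrix, while `angle_gate` models the $\pi/6$ mutation gate as a plane rotation by $\pi/6$; that modeling choice is an assumption made for illustration, since the exact gate matrix is not reproduced here.

```python
import numpy as np

NOT_GATE = np.array([[0.0, 1.0], [1.0, 0.0]])

def angle_gate(phi):
    """Hypothetical mutation gate: a plane rotation of the amplitudes by phi."""
    return np.array([[np.cos(phi), -np.sin(phi)],
                     [np.sin(phi),  np.cos(phi)]])

t = 0.4
amplitude = np.array([np.cos(t), np.sin(t)])

swapped = NOT_GATE @ amplitude                 # [sin(t), cos(t)]: same two values
mutated = angle_gate(np.pi / 6) @ amplitude    # [cos(t + pi/6), sin(t + pi/6)]
```

The NOT-gate output contains exactly the same two numbers as the input, only exchanged, whereas the rotated gene carries two new amplitude values, which is the sense in which an angle-based gate adds diversity.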
In order to apply the IDCQGA to the matching process of the optimal classification atoms, the signal separation degree D by the distance criterion should be used as the fitness function of the IDCQGA. The iterative process is shown in Figure 4.

Radar Signal Recognition
The proposed system structure of radar signal recognition is shown in Figure 5. IDCQGA extracts the optimal classification atoms of multi-class signals under the constraint of the distance criterion-OMP, and the inner products of the optimal classification atoms and the signals then give the eigenvectors. After the eigenvectors are determined, a classifier is needed to recognize them. SVM is suitable for small-sample signal classification, but it needs an additional intelligent algorithm to search for its parameter settings. Therefore, this paper introduces the ELM algorithm, which is faster and has fewer parameters to set, to realize signal recognition. In ELM, the activation function $g(x)$ is the sigmoid function, and the number of hidden layer nodes $L$ is obtained by grid search. Since ELM is introduced mainly to verify the effectiveness of the proposed feature extraction algorithm, this paper does not discuss ELM in depth; readers can refer to [29,30] for a detailed analysis of ELM.
The specific process is shown in Algorithm 1.

Algorithm 1. The specific process of the signal recognition method.

Input: training and testing signals
Step1: Initialize the population size, quantum gene number, maximum iteration number and mutation probability of IDCQGA. Initialize the iteration times of the distance criterion-OMP algorithm. Take the atom $g_\gamma$ defined by the TF parameter $\gamma(s, u, \xi, c)$ as the optimization parameter; the signal separation degree $D(f^{(i,j)}, g_\gamma)$ is the fitness function.
Step2: Search for the maximum of the fitness function and record the $n$ optimal solutions $g_\gamma$. The $n$ optimal solutions $g_\gamma$ represent the extracted optimal classification atoms.
Step3: Obtain n-dimensional eigenvectors by taking the inner products of the $n$ optimal classification atoms with all classes of signals. Then, normalize the feature vectors to [0, 1].
Step4: Randomly set the weights $w$ and offsets $b$ between the input and the hidden layer in ELM, and determine the number of hidden layer nodes $L$ by grid search.
Step5: Calculate the output matrix $H$ and the output weight $\beta$ of the network using the training set, and obtain the ELM model.
Step6: Use the ELM model to test the signal samples and evaluate the performance of the model.
Output: identification result of testing signals.
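Steps 3 to 6 of the algorithm can be sketched with a small ELM written directly in NumPy. This is a minimal illustration under stated assumptions: two Gaussian clusters stand in for the eigenvectors, the hidden layer size `n_hidden=20` is fixed arbitrarily instead of being found by grid search, and the helper names (`train_elm`, `predict_elm`, `make_set`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_elm(x, labels, n_hidden, rng):
    """Fit an ELM: random input weights, least-squares output weights."""
    w = rng.uniform(-1.0, 1.0, size=(x.shape[1], n_hidden))  # random weights (Step4)
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # random offsets (Step4)
    h = sigmoid(x @ w + b)                                   # hidden output matrix H (Step5)
    targets = np.eye(int(labels.max()) + 1)[labels]          # one-hot class targets
    beta = np.linalg.pinv(h) @ targets                       # output weights (Step5)
    return w, b, beta

def predict_elm(x, model):
    w, b, beta = model
    return np.argmax(sigmoid(x @ w + b) @ beta, axis=1)

# Toy stand-in for the eigenvectors: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)

def make_set(n):
    x = np.vstack([rng.normal(0.0, 0.3, (n, 2)), rng.normal(3.0, 0.3, (n, 2))])
    y = np.repeat([0, 1], n)
    return x, y

x_train, y_train = make_set(50)
x_test, y_test = make_set(50)

# Min-max normalization (Step3) using training statistics only, so test
# features may fall slightly outside [0, 1].
lo, hi = x_train.min(axis=0), x_train.max(axis=0)
x_train_n = (x_train - lo) / (hi - lo)
x_test_n = (x_test - lo) / (hi - lo)

model = train_elm(x_train_n, y_train, n_hidden=20, rng=rng)
accuracy = np.mean(predict_elm(x_test_n, model) == y_test)   # Step6
```

Only the output weights are fitted, and they come from a single pseudo-inverse, which is why ELM training is fast and has few parameters to tune compared with an iteratively trained network.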

Simulation Results and Analysis
The following experiments are performed in MATLAB 2018a on an Intel(R) Core i7-6700 4.0-GHz processor running the Windows 7 operating system. The proposed algorithm is in the analytical stage in a simulated computational environment.

Performance of IDCQGA
This section first verifies the algorithm on Schaffer's F6 function, a standard test function for optimization algorithms. Figure 6a shows the three-dimensional surface of Schaffer's F6 function, and Figure 6b,c show its profiles at $y = 0$ and $x = 0$. From Figure 6, we know that Schaffer's F6 function has numerous local maxima in (−100, 100); the global maximum is located at (0, 0), where the value is 1. The function expression is as follows:

$$f(x, y) = 0.5 - \frac{\sin^2\!\sqrt{x^2 + y^2} - 0.5}{\left[ 1 + 0.001 (x^2 + y^2) \right]^2}$$

The optimization performance of the genetic algorithm (GA) [21], QGA [22], traditional DCQGA [24] and the IDCQGA of this paper is compared. Table 1 shows the parameter settings of each optimization algorithm, and the optimization results of the four algorithms are shown in Table 2 and Figure 7. The simulation results show that both GA and QGA are trapped in local optima; only DCQGA and IDCQGA reach an optimal value greater than 0.99. QGA is the first to fall into a local optimum, mainly because its fixed rotation angle limits the performance of the algorithm. IDCQGA needs significantly fewer generations to converge than DCQGA, which shows that the high-density coding method in this paper compresses the search space and improves the convergence speed. At the same time, IDCQGA finds the largest maximum value, which also verifies the feasibility of the $\pi/6$ mutation gate.
In order to further verify the stability of the four algorithms, each algorithm is run 10 times, and the simulation results are shown in Figure 8. GA shows the largest performance fluctuation, followed by QGA. DCQGA converges in 6 of the 10 runs, and one of its runs fluctuates strongly. IDCQGA converges in all 10 runs, which further verifies the stability of the proposed algorithm. For the optimization of complex functions, the high-density coding and the mutation of the algorithm improve its robustness and search performance.
In the process of extracting the optimal classification atoms from the TF atom library, the fitness function of the optimization algorithm is replaced by the absolute value of the signal separation degree function.
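For readers who want to reproduce the benchmark, the test function can be written in a few lines. The sketch below assumes the commonly used maximization form of Schaffer's F6, whose global maximum is 1 at the origin, consistent with the description in this section; the random sampling is only an illustration of how hard the maximum is to hit and is not one of the compared algorithms.

```python
import numpy as np

def schaffer_f6(x, y):
    """Schaffer's F6 in its maximization form (global maximum 1 at the origin)."""
    r2 = x ** 2 + y ** 2
    return 0.5 - (np.sin(np.sqrt(r2)) ** 2 - 0.5) / (1.0 + 0.001 * r2) ** 2

# Value at the global maximum.
peak = schaffer_f6(0.0, 0.0)

# Uniform random search over (-100, 100)^2 as a naive baseline.
rng = np.random.default_rng(0)
samples = rng.uniform(-100.0, 100.0, size=(10000, 2))
values = schaffer_f6(samples[:, 0], samples[:, 1])
best_random = values.max()
```

The function oscillates in concentric rings around the origin, so a search that lands on a ring near the center sees a value close to, but below, the global maximum; this is what makes it a useful test of whether an optimizer escapes local optima.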

Performance of the Optimal Classification Atoms
Firstly, five typical radar signals of linear frequency modulated (LFM), BPSK, quadrature phase-shift keying (QPSK), CON and binary frequency-shift keying (BFSK) are selected for verification. The pulse width of all signals is 10 us, sampling frequency is 100 MHz. For LFM, initial frequency is 5 MHz, bandwidth is 25 MHz; For BPSK and QPSK, the carrier frequency are 25 MHz, with 5-bir Barker coding and 16-bit Frank coding, respectively. For CON, the carrier frequency is 25 MHz; For BFSK, the lowest frequency is 10 MHz, the highest frequency is 25 MHz, with 7-bit Barker coding.
When the signal-noise-ratio (SNR) is 0 dB, 100 training samples and 100 test samples are randomly generated for each type of signal. Firstly, for the training sample set, 20 optimal classification atoms are selected by using the distance criterion-OMP (the number of atoms selected is obtained by many experiments) in Section 3.1. The parameters of the IDCQGA are as same as the experiment 4.1. The 20 optimal classification atoms are internally product with each training sample, and the 20-dimensional eigenvectors are obtained. In order to visualize the eigenvectors, the scatter plots and two-dimensional projection plots are drawn by selecting the first three-dimensional eigenvectors, as shown in Figure 9. In order to further verify the stability of the four algorithms, each algorithm is tested 10 times, and the simulation results are shown in Figure 8. It can be seen that the biggest performance fluctuation of the algorithm is GA, followed by QGA. DCQGA algorithm achieves 6 times convergence state, and one optimization fluctuation is large. IDCQGA algorithm converges in 10 experiments, further verifying the stability of the proposed algorithm. For the optimization of complex functions, the high-density coding and mutation of the algorithm can improve the robustness and search performance of the algorithm, which is an excellent optimization algorithm. In the process of extracting the optimal classification atoms from the TF atom library, the fitness function of the optimization algorithm is replaced by the absolute value of the signal separation degree function.

Performance of the Optimal Classification Atoms
Firstly, five typical radar signals are selected for verification: linear frequency modulated (LFM), BPSK, quadrature phase-shift keying (QPSK), CON and binary frequency-shift keying (BFSK). The pulse width of all signals is 10 μs, and the sampling frequency is 100 MHz. For LFM, the initial frequency is 5 MHz and the bandwidth is 25 MHz. For BPSK and QPSK, the carrier frequency is 25 MHz, with 5-bit Barker coding and 16-bit Frank coding, respectively. For CON, the carrier frequency is 25 MHz. For BFSK, the lowest frequency is 10 MHz and the highest frequency is 25 MHz, with 7-bit Barker coding.
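The parameter set above can be turned into test waveforms along the following lines. This is a sketch under assumptions that the text does not state: the complex (analytic) signal representation, the mapping of Barker +1 chips to the higher BFSK frequency, and all function names are illustrative choices.

```python
import numpy as np

FS = 100e6                   # sampling frequency, 100 MHz
PW = 10e-6                   # pulse width, 10 us
N = int(round(FS * PW))      # 1000 samples per pulse
T = np.arange(N) / FS        # time axis

BARKER5 = np.array([1, 1, 1, -1, 1])
BARKER7 = np.array([1, 1, 1, -1, -1, 1, -1])


def chips(code, n=N):
    """Stretch a code sequence so that it spans n samples."""
    idx = np.arange(n) * len(code) // n
    return np.asarray(code)[idx]


def frank_phases(m=4):
    """Phases of an m*m Frank code (m = 4 gives the 16-element code)."""
    i, j = np.meshgrid(np.arange(m), np.arange(m), indexing="ij")
    return (2 * np.pi / m * i * j).ravel()


def lfm(f0=5e6, bw=25e6):
    return np.exp(1j * 2 * np.pi * (f0 * T + 0.5 * bw / PW * T**2))


def con(fc=25e6):
    return np.exp(1j * 2 * np.pi * fc * T)


def bpsk(fc=25e6):
    return con(fc) * chips(BARKER5)


def qpsk(fc=25e6):
    return con(fc) * np.exp(1j * chips(frank_phases(4)))


def bfsk(f_lo=10e6, f_hi=25e6):
    # Assumption: +1 chips use the higher frequency, -1 chips the lower one.
    freq = np.where(chips(BARKER7) > 0, f_hi, f_lo)
    phase = 2 * np.pi * np.cumsum(freq) / FS   # phase-continuous switching
    return np.exp(1j * phase)
```

Each function returns one noise-free pulse of 1000 complex samples. Noise at the desired SNR would be added afterwards.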
When the signal-to-noise ratio (SNR) is 0 dB, 100 training samples and 100 test samples are randomly generated for each type of signal. First, for the training sample set, 20 optimal classification atoms are selected using the distance criterion-OMP of Section 3.1 (the number of atoms was determined through repeated experiments). The parameters of the IDCQGA are the same as in Experiment 4.1. The inner product of each training sample with the 20 optimal classification atoms is taken, giving 20-dimensional eigenvectors. To visualize the eigenvectors, scatter plots and two-dimensional projection plots are drawn from their first three dimensions, as shown in Figure 9. The first three dimensions of the eigenvectors extracted from the five types of signals separate well; LFM, BPSK and CON in particular show good intra-class aggregation. The intra-class aggregation of QPSK is relatively poor, but its separation from the other classes is better.
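The feature construction described above (inner product of each sample with the selected atoms) can be sketched as follows. Taking the magnitude of the complex inner product and normalizing the atoms to unit energy are assumptions, and `extract_features` is a hypothetical name.

```python
import numpy as np


def extract_features(samples, atoms):
    """Build eigenvectors from inner products with the classification atoms.

    samples: (n_samples, n_points) complex array, one signal per row
    atoms:   (n_atoms, n_points) complex array, one selected atom per row
    returns: (n_samples, n_atoms) real array of inner-product magnitudes
    """
    # Normalize each atom to unit energy (an assumption).
    atoms = atoms / np.linalg.norm(atoms, axis=1, keepdims=True)
    # <x, g> = sum_n x[n] * conj(g[n]) for every sample/atom pair.
    return np.abs(samples @ atoms.conj().T)
```

With 20 selected atoms the result has 20 columns, and `features[:, :3]` gives the three coordinates used for a scatter plot like the one in Figure 9.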
Then, the number of signal categories is increased and similar signals are added: eight types of radar signals, LFM1, LFM2, BPSK, QPSK, CON1, CON2, BFSK and phase 2 (P2), are selected for simulation verification. For LFM1, the initial frequency is 5 MHz and the bandwidth is 10 MHz. For LFM2, the initial frequency is 5 MHz and the bandwidth is 20 MHz. For BPSK and QPSK, the parameters are the same as in the previous experiment. For CON1 and CON2, the carrier frequencies are 15 MHz and 10 MHz, respectively. For P2, the carrier frequency is 30 MHz, the number of sampling points per frequency step is 4, and the number of cycles per phase code is 5. For BFSK, 13-bit Barker coding is used, the lowest frequency is 5 MHz, and the highest frequency is 15 MHz.
In the same way, a total of 800 training samples and 800 test samples are formed at 0 dB SNR. First, for the training samples, 25 optimal classification atoms are selected using the distance criterion-OMP of Section 3.1. The inner product of each training sample with the 25 optimal classification atoms is taken, giving 25-dimensional eigenvectors. The scatter plot and two-dimensional projection plots are again drawn from the first three dimensions of the eigenvectors, as shown in Figure 10.
It can be seen from Figures 9 and 10 that the eigenvectors obtained from the optimal classification atoms extracted by the proposed algorithm are separable to some extent. As the number of classes increases, the overlap of the eigenvectors becomes more serious, especially for BPSK and QPSK signals. At low SNR, the clustering of the eigenvectors becomes worse, which makes signal recognition more difficult. When the eigenvectors overlap, the performance of recognition algorithms based on hard partition is greatly affected. ELM can map the inseparable feature space to a high-dimensional space to find the optimal partition. Therefore, it is reasonable to use ELM in this paper.
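For reference, the basic ELM idea (a random, fixed hidden layer followed by a least-squares output layer) can be sketched as follows. This is a generic minimal version, not the configuration used in the experiments; the tanh activation, the Gaussian random weights and the class name `ELM` are assumptions.

```python
import numpy as np


class ELM:
    """Minimal extreme learning machine for classification (illustrative)."""

    def __init__(self, n_hidden=100, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random nonlinear map of the features into a higher-dimensional space.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        """X: (n_samples, n_features) real array; y: integer labels 0..K-1."""
        n_classes = int(np.max(y)) + 1
        # Hidden-layer weights are drawn once and never trained.
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        targets = np.eye(n_classes)[y]            # one-hot targets
        H = self._hidden(X)
        # Output weights: least-squares solution via the pseudo-inverse.
        self.beta = np.linalg.pinv(H) @ targets
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)
```

Because only the output weights are solved for, training reduces to one pseudo-inverse, which is why ELM trains quickly compared with iteratively trained classifiers.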

Performance of the Recognition Algorithm
Firstly, the five typical radar signals of Experiment 4.2 are selected for simulation. To verify the performance of the algorithm at different SNRs, 100 samples of each kind of signal are randomly generated at each of the eight SNR levels from −4 dB to 10 dB. The total number of training samples is 4000 (5 × 100 × 8 = 4000). Then, 4000 test samples are regenerated in the same way. In the ELM model, the parameter L is searched over the range [0, 1000], and L = 800 is obtained through grid search.
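The data-set size and the noise scaling can be checked with a short sketch. The 2 dB spacing of the SNR grid is inferred from the eight levels between −4 dB and 10 dB and is an assumption, as are the complex white Gaussian noise model and the name `add_awgn`.

```python
import numpy as np


def add_awgn(signal, snr_db, rng):
    """Add complex white Gaussian noise so that the result has the given SNR."""
    p_signal = np.mean(np.abs(signal) ** 2)
    p_noise = p_signal / 10 ** (snr_db / 10)
    # Split the noise power equally between real and imaginary parts.
    noise = np.sqrt(p_noise / 2) * (
        rng.standard_normal(signal.shape) + 1j * rng.standard_normal(signal.shape)
    )
    return signal + noise


# Eight SNR levels from -4 dB to 10 dB (2 dB step is an assumption).
SNR_GRID_DB = np.arange(-4, 11, 2)
N_CLASSES, N_PER_CLASS = 5, 100
n_total = N_CLASSES * N_PER_CLASS * len(SNR_GRID_DB)   # 5 x 100 x 8
```

Here `n_total` evaluates to 4000, matching the training-set size stated above.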
The experimental results at all SNRs are shown in Figure 11. The recognition rate of the five kinds of signals generally increases with SNR. When the SNR is −4 dB, the recognition rates of BPSK and QPSK are below 90%, which is consistent with the conclusion of Experiment 4.2. When the SNR increases to 10 dB, the recognition rate of all signals is 100%. Thus, the signal recognition algorithm based on ELM is valid. In addition, the simulation results show that CON, LFM and BFSK keep a high recognition rate at low SNR. This is mainly due to the high similarity between these three kinds of signals and the atoms in the Chirplet atom library, so the extracted classification atoms express the main features of the signals well. At low SNR, the TF atoms of BPSK and QPSK discriminate poorly between the two classes, which results in a low recognition rate.
Then, the eight kinds of radar signals of Experiment 4.2 are selected. For each type of signal, 100 samples are randomly generated at each of the eight SNR levels from −4 dB to 10 dB, for a total of 6400 (8 × 100 × 8 = 6400) training samples. Then, 100 test samples of each class are regenerated at each SNR level, giving 6400 test samples. The traditional OMP + SVM [9], the recently proposed deep learning algorithm (CWD + CNN) [32], the proposed distance criterion-OMP + SVM and distance criterion-OMP + ELM are compared in simulation, and the overall correct recognition rate is shown in Figure 12.
Figure 12 shows that the overall correct recognition rates of the SVM and ELM algorithms are close, which verifies the feasibility and effectiveness of the proposed algorithm. Surprisingly, the performance of CWD + CNN is not ideal. At low SNR, the proposed algorithm has the best anti-noise ability (92.6% at 0 dB). When the SNR rises to 8 dB, the recognition rate of the three algorithms for the eight types of signals exceeds 99%. The main reason is that the optimal classification atoms can still extract atom features suitable for classification at low SNR. In the CWD + CNN algorithm [32], the signal is first converted into a TF image. To make the size of the TF images consistent, this algorithm applies a series of image processing and clipping steps to obtain a 32 × 32 pixel black-and-white image, which is then sent to the CNN classifier to obtain the recognition result. The CWD + CNN algorithm works well for signals with frequency variation, such as LFM and Costas signals. In this paper, however, the TF images of both CON1 and CON2 are a straight line, so the two classes yield the same 32 × 32 samples after image clipping. Therefore, its recognition performance is poor for single-frequency signals (such as CON1 and CON2). The optimal classification atoms extracted by the proposed algorithm can clearly distinguish signals with different carrier frequencies, which avoids the confusion between CON1 and CON2.
Although the performance of ELM is similar to that of SVM, the complexity of ELM is lower (the CPU test time of SVM in this paper is three times that of ELM). Meanwhile, the CPU test time of the CWD + CNN algorithm is about twice that of the proposed algorithm. The main reason is that the image preprocessing in the CWD + CNN algorithm increases its complexity.
To analyze more precisely which kinds of signal have the lower recognition rate, the confusion matrices of the eight kinds of signals at 0 dB are shown in Tables 3-5. Comparing Tables 3 and 5, we can see that the CWD + CNN algorithm easily confuses CON1 and CON2 signals, which validates our previous discussion. Meanwhile, Table 5 shows that 7 BPSK signals are mistakenly identified as QPSK and 6 QPSK signals are mistaken for BPSK. Compared with Table 4, the confusion between BPSK and QPSK is reduced to some extent. As with the five types of signals, BPSK and QPSK still have the lower recognition rates, mainly because the TF atoms of the similar signals (LFM2 and CON2) resemble those of BPSK and QPSK, which further increases the difficulty of signal recognition.
Experiments 4.2 and 4.3 show that the proposed optimal classification atom can be used as an element to construct eigenvectors, and that it still achieves a higher recognition rate than the OMP algorithm at low SNR. At the same time, considering the fast convergence of ELM during training and its recognition performance similar to that of SVM, it is reasonable to select ELM as the classifier. Especially for overlapping data, the ELM algorithm can map the inseparable feature space to a high-dimensional separable space to realize complex signal recognition.
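A confusion matrix of the kind reported in Tables 3-5 can be computed from true and predicted labels as in the following sketch. The function names are hypothetical, and the labels in the usage note are toy values, not the results of this paper.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # Count every (true, predicted) pair, including repeated pairs.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def per_class_rate(cm):
    """Recognition rate of each class: correct count / number of true samples.

    Assumes every class has at least one true sample.
    """
    return np.diag(cm) / cm.sum(axis=1)
```

For the toy labels `y_true = [0, 0, 0, 1, 1, 2]` and `y_pred = [0, 1, 0, 1, 1, 0]`, the off-diagonal entry in row 0, column 1 counts the one class-0 sample mistaken for class 1, in the same way that the BPSK/QPSK confusions are read from the tables.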

Conclusions
This paper studies the waveform feature extraction and recognition of radar electronic reconnaissance signals. Firstly, by analyzing the problems of the traditional atom matching decomposition algorithm, an optimal classification atom feature extraction method is proposed, which establishes the signal separation degree between different signal waveforms using a distance criterion. Then, TF atoms with a larger signal separation degree are extracted to construct the classification atom set, and the inner products of the signal with the selected atoms are used as the eigenvector. Finally, ELM is introduced as the classifier to complete the signal recognition, which effectively improves recognition performance in the presence of overlapping eigenvectors. Simulation results show that the proposed algorithm has better stability and anti-noise performance than the traditional OMP and SVM recognition algorithm. When the SNR is 0 dB, the signal recognition rate of the proposed algorithm reaches 92.6%, which provides a new scheme for the classification and recognition of radar signals in electronic reconnaissance.
It is important to note that the radar signal recognition system explored in this paper classifies signals but does not identify them. In the radar field, "classify" means distinguishing different types of signals, while "identify" means recognizing particular copies of the same radar type. The latter is called Specific Emitter Identification (SEI). SEI is based on intra-pulse analysis, the use of out-of-band radiation, the use of fractal theory, and other methods that make it possible to extract additional features from the radar signal [33][34][35][36]. Since SEI technology is also an important subject in the field of electronic reconnaissance, we intend to analyze it in depth in future work.
Author Contributions: J.W. and G.R. conceived of and designed the study. G.R. and Q.G. performed the experiments. G.R. wrote the paper. Q.G. and X.G. reviewed and edited the manuscript. All authors read and approved the manuscript.

Conflicts of Interest: The authors declare that there is no conflict of interest regarding the publication of this paper.