Radar Waveform Recognition Based on Time-Frequency Analysis and Artiﬁcial Bee Colony-Support Vector Machine

: In this paper, a system for identifying eight kinds of radar waveforms is explored. The waveforms are the binary phase shift keying (BPSK), Costas codes, linear frequency modulation (LFM) and polyphase codes (including P1, P2, P3, P4 and Frank codes). The features of power spectral density (PSD), moments and cumulants, instantaneous properties and time-frequency analysis are extracted from the waveforms and three new features are proposed. The classiﬁer is support vector machine (SVM), which is optimized by artiﬁcial bee colony (ABC) algorithm. The system shows well robustness, excellent computational complexity and high recognition rate under low signal-to-noise ratio (SNR) situation. The simulation results indicate that the overall recognition rate is 92% when SNR is − 4 dB.


Introduction
Low Probability of Intercept (LPI) radar waveforms possess many characteristics.They are widely used in many fields, such as air defense radar, airborne fire control radar and beyond visual range radar and so forth.However, the detection of them is a difficult thing.Therefore, it is important to study LPI radar waveform recognition.
Many recognition methods have been put forward to recognize radar waveforms.For instance, extracting the bispectrum cascade feature to identify six kinds of waveforms (CW, LFM, NLFM, BPSK, QPSK and FSK) in [1].Atomic Decomposition (AD) and Expectation Maximization (EM) algorithms are used to detect radar waveforms [2].But only LFM and BPSK can be identified in this method.In [3], power spectrum accumulation and similarity theory are utilized to detect the noise frequency modulation signal.When SNR is −3 dB and the duration is 1 ms, the recognition rate is 100%.The method based on the statistics of energy for detecting noise frequency modulation signal is proposed [4].But it has good performance only for the signals of long time.The paper [5] presents Bispestra Diagonal Slice (BDS) to distinguish four types of waveforms (including FMCW, Frank code, Costas codes and FSK/PSK).The ratio of successful recognition (RSR) is more than 93.4% when signal-to-noise ratio (SNR) ≥8 dB.In [6], random projections and sparse classification are adopted to recognize LFM, FSK and PSK.The RSR is about 90% when SNR is 0 dB.Lunden explored the system based on Choi-Williams distribution (CWD) and Wigner-Ville distribution (WVD) to distinguish eight kinds of waveforms [7].However, the algorithm requires prior information and the RSR is 98% at SNR of 6 dB.In [8], Zhang proposes the system to recognize eight radar waveforms.The RSR is 94.7% at SNR of −2 dB and the number of features is 23.Therefore, it is a challenge for the existing methods to distinguish more radar waveforms with fewer features under high noise environment.
The system for recognizing eight LPI waveforms is proposed in this paper.The recognizable radar waveforms contain BPSK, Costas codes, LFM, P1-P4 and Frank codes.The classifier is the support vector machine (SVM) of artificial bee colony (ABC) algorithm.The features what we use are roughly divided into two types.Some features are calculated by time-domain signals and we call them time features.The other features are from time-frequency analysis and we call them time-frequency (T-F) features.Time features are extracted from the power spectral density (PSD), second order moments and cumulants and instantaneous properties.T-F features are related to WVD and CWD.Simulation results indicate that recognition rate can reach up to 92% when SNR is −4 dB.
The contributions of this paper are mainly in four aspects: (1) the recognition system is proposed and getting better results than other methods; (2) proposing three new features, that is, the ratio of peak values of WVD ( Mw ), the number of extreme points ( Nm ) and the oscillation amplitude of the sidelobe ( âs ); (3) the number of features of this article is less than other methods; (4) and prior information is not needed in the system.
The framework of this article is shown below.Section 2 mainly presents the structure of system.The classifier is explained in Section 3.All the features and the principle of each feature are introduced in Section 4. Section 5 conveys the simulation results and the conclusions are drawn in Section 6.

Recognition System Structure
Recognition system is employed to classify the intercepted waveforms and we can obtain the category of them.Recognition System can be divided into two parts, that is, features and classifier.First of all, the features of the intercepted radar waveforms are extracted, that is, time features and T-F features.Then, to select part of the data as the training set and get the model through training.Finally, to take the rest of data as the test set and put the test set and model into the classifier to get the classified results.For more details, see Figure 1.
Electronics 2018, 7, x FOR PEER REVIEW 2 of 19 94.7% at SNR of −2 dB and the number of features is 23.Therefore, it is a challenge for the existing methods to distinguish more radar waveforms with fewer features under high noise environment.
The system for recognizing eight LPI waveforms is proposed in this paper.The recognizable radar waveforms contain BPSK, Costas codes, LFM, P1-P4 and Frank codes.The classifier is the support vector machine (SVM) of artificial bee colony (ABC) algorithm.The features what we use are roughly divided into two types.Some features are calculated by time-domain signals and we call them time features.The other features are from time-frequency analysis and we call them time-frequency (T-F) features.Time features are extracted from the power spectral density (PSD), second order moments and cumulants and instantaneous properties.T-F features are related to WVD and CWD.Simulation results indicate that recognition rate can reach up to 92% when SNR is −4 dB.
The contributions of this paper are mainly in four aspects: (1) the recognition system is proposed and getting better results than other methods; (2) proposing three new features, that is, the ratio of peak values of WVD ( ˆw M ), the number of extreme points ( ˆm N ) and the oscillation amplitude of the sidelobe ( ˆs a ); (3) the number of features of this article is less than other methods; (4) and prior information is not needed in the system.The framework of this article is shown below.Section 2 mainly presents the structure of system.The classifier is explained in Section 3.All the features and the principle of each feature are introduced in Section 4. Section 5 conveys the simulation results and the conclusions are drawn in Section 6.

Recognition System Structure
Recognition system is employed to classify the intercepted waveforms and we can obtain the category of them.Recognition System can be divided into two parts, that is, features and classifier.First of all, the features of the intercepted radar waveforms are extracted, that is, time features and T-F features.Then, to select part of the data as the training set and get the model through training.Finally, to take the rest of data as the test set and put the test set and model into the classifier to get the classified results.For more details, see Figure 1.The intercepted signal in system can be given by where   x k is the radar waveform.  n k is the noise and we assume it is additive white Gaussian noise (AWGN).A and   k  are the amplitude and phase sequence, respectively.The phase   k  of BPSK, Costas, LFM and polyphase codes is listed in Table 1.The intercepted signal in system can be given by where x(k) is the radar waveform.n(k) is the noise and we assume it is additive white Gaussian noise (AWGN).A and φ(k) are the amplitude and phase sequence, respectively.The phase φ(k) of BPSK, Costas, LFM and polyphase codes is listed in Table 1.
Table 1.The phase of 8 radar waveforms.

Radar Waveforms Phase φ(k)
In Table 1, f c is the carrier frequency.For BPSK, θ is the initial phase and its value is 0 or π.For Costas, f s is the frequency sequence and it is given by For LFM, µ is the slope of instantaneous frequency and it equals to the ratio of bandwidth B to pulse width τ.For polyphase codes, N c is the number of symbols, In addition, √ N c must be the even for P2 codes.
The classifier in the system is composed of network1 and network2.It can classify eight waveforms.BPSK, Costas and LFM can be directly classified by network1 and network2 does not work at this time.When the signal is deemed to be polyphase codes (P1-P4 and Frank codes) by network1, then network2 is used to classify polyphase codes in detail.The purpose of adopting two classification networks is to reduce input features and improve the classification accuracy.For more details, see Figure 2.
Table 1.The phase of 8 radar waveforms.

Radar Waveforms
In Table 1, c f is the carrier frequency.For BPSK,  is the initial phase and its value is 0 or  .
For Costas, s f is the frequency sequence and it is given by For LFM,  is the slope of instantaneous frequency and it equals to the ratio of bandwidth B to pulse width  .For polyphase codes, c N is the number of symbols, In addition, c N must be the even for P2 codes.
The classifier in the system is composed of network1 and network2.It can classify eight waveforms.BPSK, Costas and LFM can be directly classified by network1 and network2 does not work at this time.When the signal is deemed to be polyphase codes (P1-P4 and Frank codes) by network1, then network2 is used to classify polyphase codes in detail.The purpose of adopting two classification networks is to reduce input features and improve the classification accuracy.For more details, see Figure 2.

Classifier
The classifier of the system is ABC-SVM and it is applied in two networks.In SVM, parameter selection has a significant impact on the classification accuracy of model.Therefore, ABC algorithm is adopted to find the optimal parameters of SVM.
In SVM [9], the hyperplane is used to separate different types of data because the symbols on both sides of the hyperplane are different.When the interval between hyplane and data is greater, the confidence of classification is greater.Therefore, the key question for SVM is to find the hyperplane that can separate the training data without mistakes and make the geometric interval (margin) is the largest, that is optimal separating hyperplane.The SVM about two classes and linear separable problems is shown in Figure 3.We assume that the number of samples is l , the training

Classifier
The classifier of the system is ABC-SVM and it is applied in two networks.In SVM, parameter selection has a significant impact on the classification accuracy of model.Therefore, ABC algorithm is adopted to find the optimal parameters of SVM.
In SVM [9], the hyperplane is used to separate different types of data because the symbols on both sides of the hyperplane are different.When the interval between hyplane and data is greater, the confidence of classification is greater.Therefore, the key question for SVM is to find the hyperplane that can separate the training data without mistakes and make the geometric interval (margin) is the largest, that is optimal separating hyperplane.The SVM about two classes and linear separable problems is shown in Figure 3.We assume that the number of samples is l, the training vectors x i ∈ R n , i = 1, . . ., l, an indicator vector y ∈ R l and y i ∈ {1, −1}.The hyperplane is (ω•x) + b = 0. Therefore, the problem of solving the optimal hyperplane is turned into an optimization problem [10]: where ω is the normal vector of hyperplane, ξ and b mean the slack variable and threshold, respectively.C is the error penalty factor and it need to be designed.φ(x i ) is the result of non-linear transformation of x i in the new space.Then the optimization problem is converted to its dual problem by the method of Lagrange [10]: min where α is the Lagrange multiplier, e is a vector of all ones and Q is an positive semidefinite matrix, that is, Q ij = y i y j K x i , x j .K x i , x j is the kernel function and it is influenced by the kernel function parameter σ.It can be given by When the Formula ( 3) is calculated, the solution α * can be obtained and the vector ω is also be confirmed, that is, b is estimated by and the classified decision function is [10] Thus, the choice of C and σ has an important impact on the classification results of SVM [11,12].
. Therefore, the problem of solving the optimal hyperplane is turned into an optimization problem [10]: where ω is the normal vector of hyperplane,  and b mean the slack variable and threshold, respectively.C is the error penalty factor and it need to be designed.   i x is the result of non-linear transformation of i x in the new space.Then the optimization problem is converted to its dual problem by the method of Lagrange [10]: where α is the Lagrange multiplier, e is a vector of all ones and Q is an positive semidefinite matrix, that is, is the kernel function and it is influenced by the kernel function parameter  .It can be given by When the Formula ( 3) is calculated, the solution  α can be obtained and the vector ω is also be confirmed, that is, b is estimated by and the classified decision function is [10]  Thus, the choice of C and  has an important impact on the classification results of SVM [11,12].ABC algorithm proposed by Karaboga [13] is utilized to evaluate the parameters of SVM in this paper.ABC is a global optimization algorithm and its purpose is to search the optimum solution of problems by imitating the behavior of honey bee swarm in nature.ABC algorithm is composed of scouts, onlookers and employed bees.The quantity of employed bees is one half of the populations and the amount of food sources (solutions) is equal to the employed bees.The specific process is described as below [14]: the first step is to initialize the food sources.Then, employed bees search around the food sources in some way to find new sources and choose the better ones according to the nectar amount (fitness).Next, onlookers pick good food sources further on the basis of the message of employed bees and confirm the quantity.If nectar amount of that food source is not improved within given steps, the employed bee will turn into the scouts.The task of scouts is to find new food sources.When the cycle reaches the terminating condition, the optimal food source will be received.For more details, see Table 2.In initialize stage, we assume the number of solutions is SN and the solutions are randomly generated.The quantity of employed bees is SN, as well.Initial solutions are given by where x i (i = 1, 2, . . ., SN) is a D-dimensional vector, D represents the amount of optimized parameters and j ∈ {1, 2, . . . ,D}. x ij is the jth parameter of x i .
In employed bees stage, employed bees calculate the fitness and search around the initial values to find new solutions.If new fitness is more than the original, employed bees will remember the new solutions and forget the original ones.When employed bees finish seeking and return to the beehives, they share information with onlookers.The formula of searching new solutions is given by where j ∈ {1, 2, . . . ,D}, k ∈ {1, 2, . . . ,SN} and In the onlooker bees stage, onlookers select solutions according to the received message with a certain probability.Then, onlookers search the solutions in the same way of employed bees to produce new solutions.If new fitness is better, we will replace the previous.The probability of onlookers to choose the solutions is given by where f it(x i ) is the fitness of the ith solution.
In the scout bees stage, if the solution x i is not improved within limit, it will be abandoned.The employed bees of that position will turn into the scouts and a new solution is produced by Formula (8).When the number of cycles reaches the maximum cycle number (MCN), the best solution will be obtained.

Features Extraction
Feature extraction is an important step in the recognition system and it is closely related to the classification results.These features are divided into time features and T-F features.Time features means these features are based on the power spectral density (PSD), second order statistics and instantaneous characteristics.T-F features are bound up with WVD and CWD.The process of extracting T-F features need to be explained.The whole features of two networks are illustrated in Table 3.
where N is the size of signals.m is the amount of conjugated sections.M10 and M20 are obtained by this formula.The nth order cumulant can be computed through [15,16] Ĉnm where M10 is calculated by Formula (11).Ĉ20 can be received by Formula (12).

Features Related to Power Spectral Density (PSD)
PSD expresses the frequency domain distribution of signal energy.Hence the PSD features can be estimated as [6] where y(k) is given by where σ 2 ε is the variance of noise.M21 is the third moment and can be received by Formula (11).γ 1 and γ 2 can be explored by Formula (13).γ 1 applies to distinguish BPSK from Costas codes.The squared complex envelope is constant for BPSK, therefore, γ 2 is beneficial to differentiate between BPSK and other waveforms.σφ means the standard deviation of the absolute of instantaneous phase and it can be given by [17] σφ where φ(k) indicates the instantaneous phase of radar waveforms and N still implies the amount of samples.The range of φ(k) is between −π and π. σf means the standard deviation of the absolute of normalized centered instantaneous frequency and it can be estimated as where f (k) indicates the normalized centered instantaneous frequency and the size of f (k) is N f .It can be obtained through the following process: 1.
to calculate the instantaneous frequency to calculate the mean of f (k), that is, µ f ;

T-F Features
where t is time variable and ω is frequency variable.The feature based on the cross terms of WVD is proposed.It is utilized to discriminate LFM from Frank code and P3 code.For a single-component LFM of finite length, there are no cross terms and the amplitude of the WVD is related to the sampling signal [20].Therefore, the absolute of the ratio of the minimum value to the maximum value is about 0.25.For Frank code and P3 code, the amplitude of the cross terms is about two times the signal terms in WVD [21].That means the maximum and minimum value are both related to the cross terms.The amplitude of cross terms contains cosine function [22].Therefore, the absolute of the ratio of the minimum to the maximum is about 1.For LFM, Frank and P3, the maximum, minimum and the ratio of 500 signals at SNR = 6 dB are displayed in Figure 4.
where t is time variable and ω is frequency variable.
The feature based on the cross terms of WVD is proposed.It is utilized to discriminate LFM from Frank code and P3 code.For a single-component LFM of finite length, there are no cross terms and the amplitude of the WVD is related to the sampling signal [20].Therefore, the absolute of the ratio of the minimum value to the maximum value is about 0.25.For Frank code and P3 code, the amplitude of the cross terms is about two times the signal terms in WVD [21].That means the maximum and minimum value are both related to the cross terms.The amplitude of cross terms contains cosine function [22].Therefore, the absolute of the ratio of the minimum to the maximum is about 1.For LFM, Frank and P3, the maximum, minimum and the ratio of 500 signals at SNR = 6 dB are displayed in Figure 4.The ratio of LFM, Frank and P3 at diverse SNR is displayed in Figure 5.When SNR is 6 dB, the ratio of LFM is about 0.24 and it is very close to 0.25.The ratio of Frank and P3 is about 0.9 and it is very close to 1.When SNR is −4 dB, the ratio of LFM is about 0. The ratio of LFM, Frank and P3 at diverse SNR is displayed in Figure 5.When SNR is 6 dB, the ratio of LFM is about 0.24 and it is very close to 0.25.The ratio of Frank and P3 is about 0.9 and it is very close to 1.When SNR is −4 dB, the ratio of LFM is about 0.3.It is also close to 0.25.The ratio of Frank and P3 is about 0.9.It is also close to 1.That indicates the differences between LFM and the other waveforms (Frank and P3).The feature related to this ratio is proposed, that is, Mw .
Frank and P3 is about 0.9.It is also close to 1.That indicates the differences between LFM and the other waveforms (Frank and P3).The feature related to this ratio is proposed, that is, ˆw M .

 
, x W t  can be calculated by Formula (17).

Choi-Williams Distribution (CWD)
CWD belongs to the Cohen time-frequency distribution [23].It is given by: where t indicates the time variable and  indicates the frequency variable.  ,    means the kernel function and it can reduce the cross terms.  ,    is given by [24]     where   0    is the scaling factor and its value is set to 1 in this paper.The two-dimensional results of CWD of eight radar waveforms are displayed in Figure 6.Mw is given by: where W min and W max are the minimum and maximum amplitudes in W x (t, ω), respectively.W x (t, ω) can be calculated by Formula (17).

Choi-Williams Distribution (CWD)
CWD belongs to the Cohen time-frequency distribution [23].It is given by: where t indicates the time variable and ω indicates the frequency variable.φ(ξ, τ) means the kernel function and it can reduce the cross terms.φ(ξ, τ) is given by [24] φ(ξ, τ) = e −ξ 2 τ 2 /σ , (20) where σ(σ > 0) is the scaling factor and its value is set to 1 in this paper.The two-dimensional results of CWD of eight radar waveforms are displayed in Figure 6.
Electronics 2018, 7, x FOR PEER REVIEW 9 of 19 Frank and P3 is about 0.9.It is also close to 1.That indicates the differences between LFM and the other waveforms (Frank and P3).The feature related to this ratio is proposed, that is, ˆw M .

 
, x W t  can be calculated by Formula (17).

Choi-Williams Distribution (CWD)
CWD belongs to the Cohen time-frequency distribution [23].It is given by: where t indicates the time variable and  indicates the frequency variable.  ,    means the kernel function and it can reduce the cross terms.  ,    is given by [24]     where   0    is the scaling factor and its value is set to 1 in this paper.The two-dimensional results of CWD of eight radar waveforms are displayed in Figure 6.

Preprocessing of the CWD Image
The next step is to calculate the features of the CWD image.Before doing that, the results of CWD need to be processed.The specific operations are as follows [8]: make the CWD transformation of 1024 points for the signals whose sampling numbers are less than 1024 and zero padding is used to deal with waveforms; 2.
resize the size of the CWD image to N × N to reduce the computation cost; 3.
the global threshold algorithm [25] is used to transform the resized CWD image into binary image and the procedure is illustrated in Figure 7; 4.
remove part of the binary image that is less than 10% of the size of the largest part to eliminate noise and then get the processed image.
For P4 code, the processing procedure of the CWD image at SNR = −6 dB is displayed in Figure 8.

CWD Features
The region consisting of pixels with the same pixel value and adjacent positions is called connected domain.For the CWD image of each waveform, P3 and Frank possess two connected domains, P1, P4 and LFM possess one domain and Costas has different results.Therefore, the amount of connected domains in binary image is a very helpful feature.In this article, that feature is obj N .It means the amount of domains that greater than 20% of maximum component.
Next feature indicates the position of the maximum energy in the time domain of image, that is, where the resized CWD image.

CWD Features
The region consisting of pixels with the same pixel value and adjacent positions is called connected domain.For the CWD image of each waveform, P3 and Frank possess two connected domains, P1, P4 and LFM possess one domain and Costas has different results.Therefore, the amount of connected domains in binary image is a very helpful feature.In this article, that feature is

CWD Features
The region consisting of pixels with the same pixel value and adjacent positions is called connected domain.For the CWD image of each waveform, P3 and Frank possess two connected domains, P1, P4 and LFM possess one domain and Costas has different results.Therefore, the amount of connected domains in binary image is a very helpful feature.In this article, that feature is N obj .It means the amount of domains that greater than 20% of maximum component.
Next feature indicates the position of the maximum energy in the time domain of image, that is, where C N×N (t, ω) is the resized CWD image.σobj and θmax represent the standard deviation of the length of domains and the rotate angle of the greatest domain, respectively.σobj has a good effect on separating "stepped waveform" (Frank and P1 codes) from "linear waveform" (P3, P4 and LFM).θmax can select P2 code from others because only the slope of P2 is negative.The calculation process can be illustrated as below [8]: for every domain, k = 1, 2, . . ., N obj ; 1. extract the domain of k in binary image; 2.
rotate the domain of k through the nearest neighbor interpolation until it is vertical to the lateral axis and denoted by B r (x, y); 3.
compute the row sum r(x), that is, compute standard deviation of r(x), that is, • calculate and output the angle of rotation of the largest object θmax ; • calculate and output the mean σobj of σk,obj1 , that is, Next step is to extract the framework of maximum domain in binary image and estimate its linear tendency by the minimal square error method.The vector f n can be obtained by removing its linear tendency in framework.Then to compute the autocorrelation function of f n , that is, where c(m) is the autocorrelation function of f n and M is the length of f n .Some characteristics of c(m) can be utilized to distinguish P1 code from P4 code.The framework of the maximum domain, the linear tendency of framework, f n and c(m) of P4 and P1 at SNR = 6 dB are displayed in Figure 9.  has a good effect on separating "stepped waveform" (Frank and P1 codes) from "linear waveform" (P3, P4 and LFM).max  can select P2 code from others because only the slope of P2 is negative.The calculation process can be illustrated as below [8]:  calculate and output the angle of rotation of the largest object max  ;  calculate and output the mean ˆobj Next step is to extract the framework of maximum domain in binary image and estimate its linear tendency by the minimal square error method.The vector n f can be obtained by removing its linear tendency in framework.Then to compute the autocorrelation function of n f , that is, where   where max s and min s are the maximum and minimum of sidelobe in   c m ,respectively.ˆm N is the number of extreme points in, that is,

Create Signals
The simulation conditions of eight radar waveforms are introduced in this section.Before creating the waveforms, we suppose the noise is additive white Gaussian noise (AWGN).The definition of SNR is We present two new features which are bound up c(m), that is, âs and Nm .They are used to distinguish P1 from P4.
âs is the oscillation amplitude of the sidelobe in c(m), that is, âs = abs(s max ) + abs(s min ), where s max and s min are the maximum and minimum of sidelobe in c(m),respectively.Nm is the number of extreme points in, that is, where N max means the amount of maxima in c(m) and N min is the number of minima in c(m).

Create Signals
The simulation conditions of eight radar waveforms are introduced in this section.Before creating the waveforms, we suppose the noise is additive white Gaussian noise (AWGN).The definition of SNR is SNR = 10 log 10 (P s /P n ), where P s indicates the signal power and P n indicates the noise power.U(•) means the ratio of frequency.For example, the initial frequency f 0 is 1500 and the sampling frequency f s is 12000.So, the ratio of frequency is U( f 0 / f s ) = U(1/8).For BPSK, the range of carrier frequency f c is from U(1/8) to U(1/4) and the barker codes is 7, 11 or 13; for Costas codes, the amount of frequency N s changes from 3 to 6, the basic frequency f min is U(1/24) and the number of samples N is from 512 to 1024; for LFM, the initial frequency f 0 is U(1/16, 1/8), the range of bandwidth ∆ f is the same as f 0 and the number of samples N is as large as Costas codes; For polyphase codes, the carrier frequency f c is also U(1/8, 1/4), M indicates the step frequency and N c indicates the coding length of signals.More details are illustrated in Table 4.

Experiment Results with SNR
In order to make experiments on the classification of eight kinds of radar waveforms, 1000 sets of each waveform are obtained.90% of them are employed to practice and the remaining is regarded as the test set.The range of SNR is from −8 dB to 14 dB.The RSR in this article is compared with Zhang [8] and Lunden [7] system.
Figure 10 shows the recognition accuracy at different SNR from this paper and from Zhang's and Lunden's studies.For BPSK, LFM and P2 code, the performance of this paper is obviously higher than the other two systems.For Costas codes, the recognition rate of Zhang's is the highest but this paper is very close to Zhang system.For P1, P3, P4 and Frank codes, the recognition rate of Zhang's is only a little higher than this paper in some cases and Lunden's is the worst.On the whole, the recognition accuracy of this paper is obviously higher than Zhang's and Lunden's, especially under the low SNR.When SNR is −4 dB, the overall RSR of this paper is 92%, the RSR of Costas and BPSK is 100% and the RSR of LFM and P2 is above 95%.The CWD images of P1, P3, P4 and Frank are similar, especially in low SNR conditions.Therefore, the effects of them are not excellent and the RSR is about 85%.What's more, the untrained waveforms T4 code [26] with the number of phase states of 2 is used to test the performance of system.After calculation, the probability that T4 is divided into unclassified waveforms is 89%.The specific results of this paper at SNR = −4 dB are displayed in When SNR is −4 dB, the overall RSR of this paper is 92%, the RSR of Costas and BPSK is 100% and the RSR of LFM and P2 is above 95%.The CWD images of P1, P3, P4 and Frank are similar, especially in low SNR conditions.Therefore, the effects of them are not excellent and the RSR is about 85%.What's more, the untrained waveforms T4 code [26] with the number of phase states of 2 is used to test the performance of system.After calculation, the probability that T4 is divided into unclassified waveforms is 89%.The specific results of this paper at SNR = −4 dB are displayed in Table 5.

Experiment Results with Robustness
The following is to discuss the stability of system.The point is to ensure how recognition accuracy varies with the number of training samples.The training samples vary from 100 to 900 and the interval is 200.We draw the conclusions by comparing with Zhang's [8] at SNR of −8 dB and −2 dB.
As Figure 11 shows, the recognition rates of this paper and Zhang's are both increasing when the training data is from 100 to 900.The RSR of this paper is higher than Zhang's under any training samples.When the number of training samples is 500, RSR is basically unchanged.That means the two systems are both stable at this time.When SNR is −8 dB, the growth of RSR is 18.5% for Zhang's and the growth of this paper is 4.4%.When SNR is −2 dB, the growth of Zhang's is 16.9% and the growth of this paper is only 0.7%.That means the growth of this paper is smaller than Zhang system.Therefore, the system of this paper has better robustness.

Experiment Results with Time
In order to measure the property of system, we calculate runtime to represent the computational complexity.Eight waveforms are tested when SNR is −8 dB and −2 dB.The simulation is repeated ten times and take the average.The time of this paper, Lunden's [7] and Zhang's [8] is shown in Table 6 and the unit is seconds.The operation environment is in Table 7.

Experiment Results with Time
In order to measure the property of system, we calculate runtime to represent the computational complexity.Eight waveforms are tested when SNR is −8 dB and −2 dB.The simulation is repeated ten times and take the average.The time of this paper, Lunden's [7] and Zhang's [8] is shown in Table 6 and the unit is seconds.The operation environment is in Table 7.In order to make the data in Table 6 more clearly, the ratio of system's (this paper and Zhang) time to Lunden's time is calculated.The results can be seen in Table 8.As Table 8 shows, this paper's time is about 0.595 of Lunden's.Zhang's time is about 0.645 of Lunden's.That means the time of this paper is the least and the time of Lunden's is the most.In Lunden system, some methods spend a lot of time but they are not utilized in this article.For example, data driven and peak search and so forth.In Zhang system, the number of features is 23 and there are only 14 features in this article.Therefore, the computational complexity of this paper is less than Zhang's and Lunden's.

Conclusions
In this paper, the system for recognizing 8 LPI radar waveforms is explored.The identifiable waveforms are BPSK, Costas codes, LFM, P1-P4 and Frank codes.There are only 14 features in the system and they are divided into time features and T-F features.Time features are obtained from the time domain signals.T-F features are related to the WVD and CWD and 3 new T-F features are proposed.The classifier of the system is SVM based on ABC algorithm.ABC is adopted to optimize the parameters of SVM and it can avoid the problem of local optimal solution to some extent.
The system has good performance and the details are as follows: Firstly, it has high recognition rate.When SNR at −4 dB, the overall RSR is 92% and the RSR of each waveform is all over 84%.Especially for LFM and P2 code, the RSR is about 97% which is far better than other systems.Secondly, it shows well robustness.When the training data is reduced to 100, the recognition rate is only dropped by 0.7% at SNR of −2 dB.Lastly, it possesses excellent computational complexity.The runtime of this paper is far less than other methods.

Figure 1 .
Figure 1.The structure of classification system is displayed in this figure.It is formed by time features, T-F features and classifier.

Figure 1 .
Figure 1.The structure of classification system is displayed in this figure.It is formed by time features, T-F features and classifier.

Figure 2 .
Figure 2. The classifier is displayed in this figure.It includes two parts, i.e., network1 and network2.

Figure 2 .
Figure 2. The classifier is displayed in this figure.It includes two parts, i.e., network1 and network2.

Figure 3 .
Figure 3.This figure shows the support vector machine (SVM) about two classes and linear separable problems.Margin means the geometric interval.

2 Figure 3 .
Figure 3.This figure shows the support vector machine (SVM) about two classes and linear separable problems.Margin means the geometric interval.

4. 1 . 3 .
Features Related to Instantaneous PropertiesIt has good effect to discriminate the frequency modulation signals from the phase modulation signals by the instantaneous properties of radar waveforms.Therefore, two features based on the instantaneous frequency and instantaneous phase are presented.

4. 2 . 1 .
Feature Related to Wigner-Ville Distribution (WVD) WVD[18] was proposed by Wigner.It is used to describe the signal distribution in time and frequency domain.Hence the WVD of x(t) is given by[19]

Figure 4 .
Figure 4.In this figure, (a), (c) and (e) are the minimum and minimum values of Wigner-Ville distribution (WVD) of linear frequency modulation (LFM), Frank and P3 at signal to noise ratio (SNR) = 6 dB, respectively.(b), (d) and (f) are the absolute of the ratio of minimum to maximum of LFM, Frank and P3, respectively.There are 500 signals of each waveform.

3 .Figure 4 .
Figure 4.In this figure, (a), (c) and (e) are the minimum and minimum values of Wigner-Ville distribution (WVD) of linear frequency modulation (LFM), Frank and P3 at signal to noise ratio (SNR) = 6 dB, respectively.(b), (d) and (f) are the absolute of the ratio of minimum to maximum of LFM, Frank and P3, respectively.There are 500 signals of each waveform.

Figure 5 .
Figure 5.In this figure, (a) is the ratio of LFM, Frank and P3 at SNR = 6 dB; (b) is the ratio at SNR = −4 dB.ˆw M is given by:

Figure 5 .
Figure 5.In this figure, (a) is the ratio of LFM, Frank and P3 at SNR = 6 dB; (b) is the ratio at SNR = −4 dB.

Figure 5 .
Figure 5.In this figure, (a) is the ratio of LFM, Frank and P3 at SNR = 6 dB; (b) is the ratio at SNR = −4 dB.ˆw M is given by:

Figure 6 .
Figure 6.The Choi-Williams distribution (CWD) results of eight waveforms are displayed in this figure.

σ
is set to 1 in this paper.(a) CWD image of LFM, (b) CWD image of Costas, (c) CWD image of BPSK, (d) CWD image of Frank, (e) CWD image of P1, (f) CWD image of P2, (g) CWD image of P3, (h) CWD image of P4.

Figure 7 .Figure 8 .
Figure 7.The procedure of transforming the resized image into binary image is displayed in this paper.

Figure 7 .Figure 7 .Figure 8 .
Figure 7.The procedure of transforming the resized image into binary image is displayed in this paper.

objN.
It means the amount of domains that greater than 20% of maximum component.Next feature indicates the position of the maximum energy in the time domain of image, that is,

Figure 8 .
Figure 8. CWD image processing of P4 code at SNR of −6 dB.

Electronics 2018, 7 ,
x FOR PEER REVIEW 12 of 19 ˆobj  and max  represent the standard deviation of the length of domains and the rotate angle of the greatest domain, respectively.ˆobj

Figure 9 .
Figure 9.In this figure, (a) and (b) are the framework of the maximum domain and its linear tendency of P4 and P1.(c) and (d) are the n f of P4 and P1.(e) and (f) are the   c m of the two

Figure 9 .
Figure 9.In this figure, (a) and (b) are the framework of the maximum domain and its linear tendency of P4 and P1.(c) and (d) are the f n of P4 and P1.(e) and (f) are the c(m) of the two waveforms.

Figure 10 .
Figure 10.Recognition accuracy at different SNR of each waveform.The overall RSR is 92% at SNR of −4 dB for this paper.(a) The RSR of BPSK (b) The RSR of Costas, (c) The RSR of LFM, (d) The RSR of P1, (e) The RSR of P2, (f) The RSR of P3, (g) The RSR of P4, (h) The RSR of Frank, (i) The overall RSR.

Figure 10 .
Figure 10.Recognition accuracy at different SNR of each waveform.The overall RSR is 92% at SNR of −4 dB for this paper.(a) The RSR of BPSK (b) The RSR of Costas, (c) The RSR of LFM, (d) The RSR of P1, (e) The RSR of P2, (f) The RSR of P3, (g) The RSR of P4, (h) The RSR of Frank, (i) The overall RSR.

Figure 11 .
Figure 11.In this figure, (a) is the RSR at different training samples at SNR = −8 dB.(b) is the result at SNR = −2 dB.

Figure 11 .
Figure 11.In this figure, (a) is the RSR at different training samples at SNR = −8 dB.(b) is the result at SNR = −2 dB.

Table 2 .
The main steps of artificial bee colony (ABC) algorithm.
4.1.1.Features Related to Moments and CumulantsSecond order moments and cumulants are suitable for discriminating BPSK from others.Therefore, two features about Moments and Cumulants are proposed.The nth moment of the signal y(k) is given by Mnm

Table 4 .
Parameters of each waveform.

Table 5 .
Classification results of this paper at SNR = −4 dB.

Table 6 .
The time of this paper, Zhang's and Lunden's.

Table 7 .
The operation environment.

Table 8 .
The ratio of system's (this paper and Zhang) time to Lunden's time.