Specific Radar Recognition Based on Characteristics of Emitted Radio Waveforms Using Convolutional Neural Networks

With the increasing complexity of the electromagnetic environment and the continuous development of radar technology, we can expect a large number of modern radars using agile waveforms to appear on the battlefield in the near future. Effectively identifying these radar signals in electronic warfare systems by relying only on traditional recognition models poses a serious challenge. In response to this problem, this paper proposes a method of recognizing emitted radar signals with agile waveforms based on a convolutional neural network (CNN). These signals are measured in electronic recognition receivers and processed into digital data, after which they undergo recognition. The implementation of this system is presented in a simulation environment with the help of a signal generator that can vary the signal signatures previously recognized and stored in the emitter database. This article describes the software components, the learning subsystem and the signal generator. The problem of training neural networks on graphics processing units and the way the learning coefficients are chosen are also outlined. The correctness of the CNN operation was tested using a simulation environment that verified its effectiveness in a noisy environment and in conditions where many mutually interfering radar signals are present. The effectiveness of the applied solutions and the possibilities of developing the learning and processing algorithms are presented by means of tables and appropriate figures. The experimental results demonstrate that the proposed method can effectively solve the problem of recognizing raw radar signals with agile time waveforms, achieving a correct recognition probability of 92–99%.


Introduction
In radio-electronic reconnaissance systems, we receive and then measure the basic time, frequency and spatial parameters (related to the scanning of the antenna) in order to recognize their emission sources (in our case, radars); the spatial situation is not visualized using these signals. Radio-electronic reconnaissance systems extract the basic characteristic parameters from measured radar signals. Based on these parameters, we can obtain information such as the system, application, type and platform of the radar, further deduce the battlefield situation, threat level, activity pattern, tactical intention, etc., and provide important intelligence support for our own decision-making systems. The modern electromagnetic environment is considered complex due to a multitude of signals originating from many different radars (emitters), and in the case of signals coming from the same radar, their parameters (features) are measured with low accuracy. In many cases the radar may change one or several signal parameters in order to perform its task more efficiently [1,2]. Since each radar has limited parameter ranges (e.g., it transmits within a limited frequency band) and often identifiable characteristics, it is assumed that radar signals with similar characteristics originate from the same device [3,4].
Opponents of artificial intelligence methods, especially deep learning methods, will point out here that the learning process may take too long in relation to the use of manually programmed filters/processing algorithms. However, in the case of deep neural networks, it is possible to reuse already learned filters from convolutional layers [40–44]. The only problem that remains then is training a three-layer neural network, which usually learns quite quickly in comparison to the convolutional layers [35,42,45–47]. Another cause for optimism is that modern graphics cards make it possible to reduce the training time of ANN classifiers from several weeks to hours or even minutes (the leaders in this field are NVIDIA and AMD) [30,31,48]. The ANN processing methods discussed in this article operate on digital signals, and will show the possibility of optimizing the methods of classifying these radar signals depending on the receiver that produces a given signal, as well as of accelerating the processing itself. The nature of ANN structures and their operation can be used for data processing, and for signals at higher frequencies directly or indirectly (after rescaling or transforming high-frequency signals down to a lower baseband frequency) at which telecommunications or radiolocation devices operate [8,39,49].
In our work we chose deep learning (DL) with convolutional neural networks for signal processing because we wanted to create an agile radar signal recognition system adjustable to the radio background [50–52]. DL for signal processing can be superior to standard fully connected neural network models (i.e., as used in [53]) or to processing with hand-crafted features (e.g., eigenvector analysis methods [54], random forests (RF), support vector machines (SVM), k-nearest neighbors [55]), because it requires neither preliminary signal processing, nor additional feature extraction, nor the standardization of measured vectors. The neural network acquires all the feature extraction during the learning process. However, as stated in other publications [55,56], to achieve better results than algorithms with hand-crafted features we need enough data patterns to train the deep model; otherwise we do not achieve better performance and can instead lead the deep model to overfitting. Better system performance can be achieved by supporting the DL method with hand-crafted pre-processing, so that we speed up the learning process and outperform the application of each method separately [55]. In our work such preliminary processing is the use of an FIR filter before the input of the classifier. Without the FIR filter, the convolutional neural network would still be able to classify the raw signals, except that we would probably have to add one or two layers to the network in order to teach it how to filter the desired signals in the frequency domain.
In paper [54] eigenvector analysis is used, which is quite efficient; however, as the number of analyzed devices increases, the eigenvector matrix grows, and this can gradually degrade system performance. In addition, methods based on the analysis of eigenvectors lose some information when converting the classifying vectors to the matrix of eigenvectors. This loss of information may lead to deterioration of the classification under disturbance conditions or when geometric changes of the input samples are introduced. When using artificial neural networks (especially convolutional networks), adding further radar classifying vectors does not require enlarging the structure of the neural network; instead, we can consider transfer learning, which means disconnecting the last layer of the neural network, connecting another one with the required number of classes, and training this layer using the pre-trained earlier layers [56]. Another argument for convolutional neural networks is that there is no need to lose information in signal pre-processing by reducing the dimensions or averaging the available processing channels. During the learning process, through the use of layers such as convolution and max pooling, the network decides on the basis of the training set which signal features are unimportant and which will be relevant in the process of proper classification [56].
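The transfer-learning idea described above can be sketched in Keras (the library family used later in this paper). The `extend_classifier` helper, the layer sizes and the toy backbone are illustrative assumptions, not the paper's actual model: the pre-trained layers are kept and frozen, and only a new dense output layer sized for the enlarged set of radar classes is trained.

```python
from tensorflow import keras
from tensorflow.keras import layers

def extend_classifier(pretrained: keras.Model, n_new_classes: int) -> keras.Model:
    # Reuse everything except the old classification head, and freeze it.
    backbone = keras.Model(pretrained.input, pretrained.layers[-2].output)
    backbone.trainable = False
    # Attach a fresh dense layer sized for the new number of radar classes;
    # only this layer's weights will be updated during training.
    out = layers.Dense(n_new_classes, activation="softmax")(backbone.output)
    return keras.Model(backbone.input, out)

# Toy "pre-trained" classifier for 10 classes, extended to 18 classes.
inp = keras.Input(shape=(1024, 1))
x = layers.Conv1D(8, 16, activation="relu")(inp)
x = layers.GlobalMaxPooling1D()(x)
old_out = layers.Dense(10, activation="softmax")(x)
base = keras.Model(inp, old_out)

model = extend_classifier(base, 18)
print(model.output_shape)  # (None, 18)
```

Because the convolutional feature extractor is frozen, only the small dense head is optimized, which is why this retraining is typically much faster than training the whole network from scratch.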
It should also be noted that in the methods shown in other works [53,54] only post-detection data are processed, without introducing any significant disturbances. In the work presented here we process only the raw signals without complicated pre-processing, which is mostly carried out by the layers of the neural network. Processing signals of the size of 355 × 10³ samples with the method of eigenvector analysis would force us to create huge matrices, whose computation in real time may sometimes prove impossible.
In the remaining parts of this article, Section 2 contains a description of the electromagnetic environment, the measurement method and the processing of radar signal parameters, as well as the method of constructing the training dataset applied in the proposed model for recognizing emitter sources. Section 3 presents the detailed structure of the CNN designed for radar signal recognition based on the measured parameters, i.e., pulse repetition interval, pulse duration, radio frequency and antenna scan period. The learning method designed for the CNN structure, the software used and the mathematical formulas are described in Section 4. The simulation environment used for testing the efficiency of radar signal recognition is described in Section 5. The analysis of the experimental results, shown in the form of tables and figures, is presented in Section 6. Section 7 contains a summary of the obtained simulation results and the conclusions.

Description of Radar Signal Parameters
The electronic support measures (ESM) systems measure the basic parameters of incoming radar signals (frequency, amplitude, bearing and elevation angles, pulse width, time of arrival and sometimes polarization). The data collected are sorted into groups considered to be from a single emitter and subsequently used to compute the time-dependent parameters (pulse repetition frequency, antenna rotation period, etc.) [1–3]. Finally, the system matches the "signal signatures", composed from the average parameters of each group, with the characteristics of known emitters stored in the emitter database (EDB). This action enables the system to identify and classify the incoming radar signals, which may have a high degree of inherent uncertainty arising from the methods of data gathering and processing [2,3]. In the electromagnetic environment a great deal of information collected by the receiver is processed in real time (Figure 1).

Figure 1. General structure of ESM system for measurement, recording, analysis and radar signals recognition [53].
The basic measurement capabilities of the MUR-20 mobile system for identification of radar signals, and of the ELINT system for recognition of onboard RF emitters, both produced by PIT-RADWAR, are the following:
- automatic direction finding that detects and monitors emission sources with frequencies ranging from 500 MHz to 18 GHz;
- signal parameters measured: frequency, pulse width, amplitude, direction of arrival, pulse repetition frequency, antenna rotation period;
- deinterleaving;
- an acousto-optical spectrum analyzer channel of 500 MHz and a compression spectrum analyzer channel of 40 MHz;
- radio frequency measurement with 1 MHz accuracy;
- instantaneous time parameters measurement with 25 ns accuracy.
The signal structure generated by a single radar varies across time and depends on parameters such as the pulse repetition interval (PRI), pulse duration (PD), radio frequency (RF) and antenna scan period (SP). The intervals of individual radar signal parameters may overlap; therefore, the given signals may be more or less similar to each other at certain times. The recognition of measured radar signal parameters is also based on the analysis of their temporal structure, which will be reconstructed using the simulation environment, because, unlike the PRI, PD, RF and SP parameters, we do not have signal temporal structures assigned to particular classes of radar signals.
The radar signals characteristics presented in this section belong to 18 different types of radars (classes). Table 1 presents the confidence intervals of radar signal parameters calculated based on their measurement data.

Constructing a Set of Data for Training a Neural Network
As noted in the previous section, based on the description and the parameter intervals assigned to the individual radars, these classes can overlap each other in certain ranges, which can significantly affect the effectiveness of proper signal source recognition. To reduce the risk of a false recognition, the final decision concerning the recognition of a given signal source is announced only after receiving all the parameters described above. For this purpose, three training sets were created.
(a) The first training dataset consists of time waveforms (TW) of the signals with variable PD, RF and intra-pulse modulation. An example of the time waveforms of a signal simulated with the use of the simulation environment (described in more detail in Section 5), based on the parameters presented in Table 1, is depicted in Figure 4.
(b) The second training dataset consists of variable PRI waveforms, which change depending on the applied inter-pulse modulation. These changes of PRI are shown in Figure 5.
(c) The third training dataset consists of variable PD waveforms changing from pulse to pulse.
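As an illustration of how such training waveforms can be synthesized, the NumPy sketch below draws PRI, PD and RF uniformly from a class interval and sums rectangular RF pulses onto one sample vector. It is a simplified stand-in for the simulator described in Section 5, and all parameter ranges used here are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def pulse_train(fs, n_samples, pri_range, pd_range, rf_range):
    """Sum rectangular RF pulses onto a length-n_samples sample vector.

    PRI, PD and RF are drawn uniformly per pulse, mimicking inter-pulse
    modulation and pulse-to-pulse PD/RF agility within a class interval.
    """
    s = np.zeros(n_samples)
    t0 = 0.0
    while True:
        pri = rng.uniform(*pri_range)   # inter-pulse modulation: PRI varies
        pd = rng.uniform(*pd_range)     # pulse duration varies pulse to pulse
        rf = rng.uniform(*rf_range)     # carrier frequency drawn per pulse
        start = int(t0 * fs)
        stop = min(int((t0 + pd) * fs), n_samples)
        if start >= n_samples:
            break
        t = np.arange(start, stop) / fs
        s[start:stop] += np.sin(2 * np.pi * rf * t)
        t0 += pri
    return s

# One example record: 1 MHz sampling, 10,000 samples, toy class intervals.
x = pulse_train(fs=1e6, n_samples=10_000,
                pri_range=(1e-3, 2e-3), pd_range=(50e-6, 100e-6),
                rf_range=(50e3, 100e3))
print(x.shape)  # (10000,)
```

One labeled training example per class is obtained by regenerating such a vector with fresh random draws, so that the network sees the allowable spread of each parameter.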


The Similarity between the Classes of Signals
From the waveforms of the changes of individual parameters presented in Section 2.1 (Figures 2 and 3), the overlapping of individual classes of signals can be noted, taking into account time and frequency dependencies. Table 2 below depicts the classes with overlapping ranges of changes in time (PRI, PD, SP) and frequency (RF).

Proposed Model
The CNN model applied in this paper is a multi-layer (deep) structure containing inter-area convolutional connections in the first layers, batch normalization layers [57] applied during the learning process to accelerate and stabilize it, sub-sampling (max pooling) layers [58] and dropout layers reducing the probability of overfitting [59]. Figure 6 below depicts the structure of the CNN used for training on the signals of changeable intra-pulse modulation TW (RF and PD), and on the PRI and PD signals with inter-pulse modulation.

The first CNN layers in this structure, in this case the 1D convolutional layers, are designed to extract the features from the signals tested [60]. Traditional 2D convolutional layers operate on images and use 2D filters to extract features of the input signal, whereas 1D convolutional layers have 1D filters and operate directly on the 1D signal without transformations (such as transforming a 1D signal into a spectrogram). Extraction of selected signal features is carried out by means of a convolution of the input signal with feature maps (filters) obtained during the learning process. Figure 7 shows examples of signal filters (visualizations of weight maps) obtained during the learning process. Due to the CNN structure used, together with the one-dimensional convolution layers, these filters are illustrated in the form of sampled time waveforms.

In the designed structure, feature extraction is performed three times. After each extraction, the maps of neuron responses obtained at the output of the convolution layer were down-sampled with the help of the max pooling layer [58] in order to facilitate feature extraction in the successive convolution layers. The last layers, the dense (fully connected) layers [4,61], are the layers that decide, on the basis of the extracted features, which class of signals we are dealing with. Table 3 presents a detailed description of each layer in the CNN structure described above.
Figure 6. Structure of the CNN used for radar signal recognition, where OT is the output tensor, SM is the size map or number of outputs of the dense layer, and NM is the number of maps.
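A minimal Keras sketch of this layer pattern (Conv1D, batch normalization and max pooling repeated three times, followed by dropout and dense layers) is given below. The filter counts, kernel sizes and input length are placeholder assumptions, not the exact values of Table 3:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_branch(input_len: int, n_classes: int) -> keras.Model:
    inp = keras.Input(shape=(input_len, 1))    # raw 1D signal, one channel
    x = inp
    for filters in (16, 32, 64):               # three feature-extraction stages
        x = layers.Conv1D(filters, 9, padding="same", activation="relu")(x)
        x = layers.BatchNormalization()(x)     # stabilizes/accelerates learning
        x = layers.MaxPooling1D(4)(x)          # down-sample neuron response maps
    x = layers.Flatten()(x)
    x = layers.Dropout(0.5)(x)                 # reduce probability of overfitting
    x = layers.Dense(128, activation="relu")(x)
    out = layers.Dense(n_classes, activation="softmax")(x)
    return keras.Model(inp, out)

model = build_branch(input_len=1024, n_classes=18)
print(model.output_shape)  # (None, 18)
```

The Conv1D layers here operate directly on the sampled waveform, which is the point made above: no spectrogram or other 2D transformation is required before the network.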
Table 3. Detailed structure of the designed CNN for separate recognition of signals based on the examined parameters (PRI, PD, TW).

The second CNN structure uses three structures similar to the model described above. They are connected by a concatenate layer [63] and work together, simultaneously processing the PRI and PD parameters and the TW to determine the class of the input signals. The last layer of this structure is of the dense type and makes the final recognition of the signal (Figure 8 and Table 4).
For the parameters PD, PRI and TW, the structure described in Table 3 was used.
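The three-branch arrangement can be sketched with the Keras functional API as follows. The branch input lengths and internal sizes are illustrative assumptions; the 18-class output follows the class count given in Section 2:

```python
from tensorflow import keras
from tensorflow.keras import layers

def branch(input_len: int):
    """One simplified feature-extraction branch for a single parameter."""
    inp = keras.Input(shape=(input_len, 1))
    x = layers.Conv1D(16, 9, padding="same", activation="relu")(inp)
    x = layers.MaxPooling1D(4)(x)
    x = layers.Flatten()(x)
    return inp, layers.Dense(64, activation="relu")(x)

in_pri, f_pri = branch(256)    # PRI waveform branch
in_pd, f_pd = branch(256)      # PD waveform branch
in_tw, f_tw = branch(1024)     # time-waveform (TW) branch

# The concatenate layer joins the three feature vectors; a final dense
# layer makes the joint recognition decision.
merged = layers.Concatenate()([f_pri, f_pd, f_tw])
out = layers.Dense(18, activation="softmax")(merged)
model = keras.Model([in_pri, in_pd, in_tw], out)
print(model.output_shape)  # (None, 18)
```

Each branch can be trained as part of the joint model, so the final dense layer learns how much weight to give the PRI, PD and TW evidence when the parameter intervals of two classes overlap.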
The third CNN model is intended to recognize raw samples of vector signals. Its first processing layers are comparable to the structures described above, although the input of the model is quite large (335,544 samples) and the last layers are transposed convolution layers [64]. Figure 9 and Table 5 below describe the model in more detail. Table 6 presents the memory requirements and the performance while processing a single input data vector for each of the CNN architectures described above.

CNN Learning
The training of the CNN structures presented in this article was carried out using the TensorFlow library [4], which enables the acceleration of the learning process using general-purpose computing on graphics processing units (GPGPU), the use of modern learning algorithms, and the construction of artificial neural network structures.
The learning process was carried out using the Adam optimizer, a modification of the stochastic gradient descent (SGD) algorithm [65], which allows for the efficient solving of optimization problems for multidimensional objective functions. Due to its efficiency, this algorithm is implemented by default in ANN training libraries such as TensorFlow or Keras [66]. The Adam gradient descent algorithm is an adaptive learning algorithm, which is an extension and combination of two methods, i.e., the AdaGrad and RMSProp methods [65]. The basic Equations (1)–(6) of the Adam algorithm are presented below, according to which the successive values of the changes in the weight vector of the ANN are calculated:

$$g(i) = \nabla_W f(W(i-1)) \quad (1)$$

$$m(i) = \beta_1\, m(i-1) + (1-\beta_1)\, g(i) \quad (2)$$

$$v(i) = \beta_2\, v(i-1) + (1-\beta_2)\, g(i)^2 \quad (3)$$

$$\hat{m}(i) = \frac{m(i)}{1-\beta_1^{\,i}} \quad (4)$$

$$\hat{v}(i) = \frac{v(i)}{1-\beta_2^{\,i}} \quad (5)$$

$$W(i) = W(i-1) - \alpha\, \frac{\hat{m}(i)}{\sqrt{\hat{v}(i)} + \epsilon} \quad (6)$$

where i is the number of the current epoch in the learning process, g(i) is the gradient of the objective function, m(i) is the first-order moment of the estimation of changes in the value of the weight vector, v(i) is the second-order moment of the estimation of changes in the value of the weight vector, m̂(i) denotes the normalization of the moment m(i), v̂(i) denotes the normalization of the moment v(i), β₁ is the decay factor of the moment m(i), β₂ is the decay factor of the moment v(i), W(i) denotes the weight vector of the CNN, α is the learning coefficient, i.e., the step of the weight vector update, and ε is a small value ensuring the stability of the calculations. In the first step, the Adam algorithm calculates the gradient of the objective function (multivariate error function) g(i), Equation (1). Then, respectively, the first-order moment m(i) and the second-order moment v(i) are calculated; these are the estimates of the weight vector changes W(i), Equations (2) and (3). Before the actual update of the weight vector, Equation (6), the correction (normalization) of the moments m(i) and v(i) is performed, Equations (4) and (5).
During each update, along with the learning progress, the influence of the first- and second-order moment estimates is reduced on the basis of the β₁ and β₂ coefficients.
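The Adam update described above can be written compactly in NumPy. The hyperparameter values below are the commonly used defaults, not necessarily those used in the paper's experiments:

```python
import numpy as np

def adam_step(w, grad, m, v, i, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; grad corresponds to g(i) in Eq. (1)."""
    m = beta1 * m + (1 - beta1) * grad            # first-order moment, Eq. (2)
    v = beta2 * v + (1 - beta2) * grad**2         # second-order moment, Eq. (3)
    m_hat = m / (1 - beta1**i)                    # bias correction, Eq. (4)
    v_hat = v / (1 - beta2**i)                    # bias correction, Eq. (5)
    w = w - alpha * m_hat / (np.sqrt(v_hat) + eps)  # weight update, Eq. (6)
    return w, m, v

# Toy objective f(w) = w^2, gradient g = 2w, starting from w = 5.
w = np.array([5.0])
m = np.zeros(1)
v = np.zeros(1)
for i in range(1, 5001):
    w, m, v = adam_step(w, 2 * w, m, v, i, alpha=0.01)
print(abs(w[0]) < 0.1)  # True
```

Note how the effective step size is bounded by α regardless of the gradient magnitude, which is the adaptive behavior the β₁ and β₂ moments provide.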
The presentation of the learning patterns is carried out using a method intermediate between online learning (updating weights after presenting one learning pattern) and full-batch learning (updating weights after presenting the entire training dataset); i.e., the learning dataset is divided into packages (mini-batches) with N patterns each. An additional element accelerating the achievement of the global minimum of the objective function of the problem under consideration (signal recognition) was the normalization of the training data (and of the signals transmitted between CNN layers) using the batch normalization method [57].
A batch normalization algorithm normalizes the input training patterns and the successive signal vectors propagated between successive ANN layers [57]. The normalization introduced by the discussed algorithm reduces or even eliminates possible oscillations of the ANN error minimization process and the possibility of becoming stuck in a local minimum. These result from the fact that small changes in the weight vector in the earlier layers may cause large changes in the weight vectors in subsequent layers (the deeper the ANN structure, the stronger the effect), the so-called exploding gradient [67], or changes in the weight vector that are too small, which cause the weight-vector gradient to fade away in subsequent layers, the so-called vanishing gradient [67].
In addition to the abovementioned stabilizing properties for the learning process, the batch normalization algorithm normalizes the training dataset (giving the training batch a mean value of 0 and a variance of 1) in such a way that the range of values processed by the ANN is approximately invariant, which further reduces the probability of oscillations of the objective function between successive epochs during the learning process.
The normalized input vectors for the successive ANN layers are calculated in the following way:

$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i \quad (7)$$

$$\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2 \quad (8)$$

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}} \quad (9)$$

$$y_i = \gamma\, \hat{x}_i + \beta \quad (10)$$

where B is the number of the current training packet (mini-batch [57] number), m is the number of elements in the packet, µ_B is the mean value of the training packet, σ²_B is the variance of the training packet, x̂_i is the normalized input training vector, y_i is the normalized and scaled training vector, and γ and β are the learned scale and shift parameters. Algorithm 1 presents an example of the batch normalization procedure for an input vector X. In the first step, the normalizing algorithm calculates the mean value µ_B and the variance σ²_B of the presented training set (Equations (7) and (8)); then, based on these values, it calculates the normalized and scaled input vectors y_i of the training set (Equations (9) and (10)) [57]. The operation described above is used not only for the input layer, but for each successive output layer in relation to the next input layer (Figure 10) of the entire CNN structure.
Figure 10. Scheme of normalizing the input vectors for successive layers using the batch normalization algorithm.
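The forward pass of batch normalization, Equations (7)–(10), can be checked directly in NumPy. Here γ and β are kept at their usual initial values of 1 and 0 (an assumption for the example); after normalization, each feature of the mini-batch has mean ≈ 0 and variance ≈ 1:

```python
import numpy as np

def batch_norm(X, gamma=1.0, beta=0.0, eps=1e-5):
    mu = X.mean(axis=0)                    # mini-batch mean, Eq. (7)
    var = X.var(axis=0)                    # mini-batch variance, Eq. (8)
    X_hat = (X - mu) / np.sqrt(var + eps)  # normalization, Eq. (9)
    return gamma * X_hat + beta            # scale and shift, Eq. (10)

rng = np.random.default_rng(1)
X = rng.normal(loc=10.0, scale=3.0, size=(32, 4))  # mini-batch of 32 vectors
Y = batch_norm(X)
print(np.allclose(Y.mean(axis=0), 0, atol=1e-6),
      np.allclose(Y.var(axis=0), 1, atol=1e-3))
# True True
```

Shifting a mini-batch with mean 10 and spread 3 back to a fixed range is exactly what keeps the value ranges between layers approximately invariant, as discussed above.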

Simulation Environment
In order to test the designed structures of neural networks, a simulator was created in C++ [68,69], using the FFTW library [70] for signal processing and the OpenCL library [71] to speed up the process of generating signals. The simulator enables one to generate many digital signals and to introduce interference and noise into them. Its operation is based on the generation, on a single time base (a vector of N signal samples), of all available signal classes with variable parameters PD, PRI, RF and SP. These parameters vary from generation to generation of successive signals in a given class. The simulator works in a quasi-real mode and allows one to view the currently generated signals, the modulus of their spectrum in the frequency domain, and the vector of samples of the signal space (Figure 11).

In the first step, the simulator reads the signal parameters for the given radar devices from the configuration file. It then checks whether the given hardware configuration allows signals to be processed at the maximum sampling frequency obtained from the configuration file. If the maximum sampling rate allowed by the system (limited by the available random access memory (RAM)) is lower than that required by the highest-frequency signal among all signal classes, the signal waveforms are scaled to the system's acceptable sampling rate.
For faster learning and processing, the scaling rate was chosen manually. After calculating these scaling factors (or setting them manually), the simulator creates a filter with a finite impulse response (FIR) [72,73] for each radar signal class. The filter coefficients for the individual classes were calculated based on the Hann time window [74] using the formulas presented in Equations (11)-(13).
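Since Equations (11)-(13) are not reproduced legibly here, the following is a minimal sketch of a standard windowed-sinc band-pass FIR design with a Hann window, using the symbols N, c and h from the surrounding text; the function name and normalized cut-off convention are assumptions, not the paper's exact formulation.

```python
import math

def hann_bandpass_fir(n_taps, f_low, f_high):
    """Windowed-sinc band-pass FIR coefficients.

    n_taps        -- N, the number of FIR filter coefficients
    f_low, f_high -- cut-off frequencies normalized to the sampling rate, in (0, 0.5)
    """
    m = (n_taps - 1) / 2.0  # center of the symmetric kernel
    taps = []
    for i in range(n_taps):
        x = i - m  # centered discrete argument
        if x == 0:
            c = 2.0 * (f_high - f_low)  # sinc limit at x = 0
        else:
            # difference of two low-pass sinc kernels gives a band-pass response
            c = (math.sin(2 * math.pi * f_high * x)
                 - math.sin(2 * math.pi * f_low * x)) / (math.pi * x)
        # Hann window coefficient h[i]
        w = 0.5 * (1.0 - math.cos(2 * math.pi * i / (n_taps - 1)))
        taps.append(c * w)
    return taps
```

The kernel is symmetric (linear phase), and its DC gain is close to zero, as expected of a band-pass filter.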
where N is the number of FIR filter coefficients, x is the vector of discrete argument values for which the FIR filter coefficients are calculated, s is the discrete shift between samples x[i] and x[i + 1], c is the vector of coefficients calculated according to the sinc(x) relationship, h is the vector of coefficients obtained using the Hann window relationship, i is the index of the successive FIR filter coefficient with i ∈ (0, N − 1), and l_B is the cut-off value of the FIR filter corresponding to the upper and lower cut-off frequencies. The block diagram of the simulation and recognition of radar signals is presented in Figure 12.
After the FIR filters are created, the main simulation loop follows. The operation of the simulation loop consists of the four steps described below.
The first step is the digital generation of signals into one output vector S of real numbers with N = F_max elements (number of samples). A single-class signal is added to the S vector at intervals depending on the PRI. Initially, the signal vector contains approximately K = 1/PRI waveforms of a signal for the given class, where the exact number depends on the individual values of PRI and PD drawn in the successive signal generations for that class. The values of PRI, PD and RF at each successive signal generation are randomly selected according to a uniform distribution (or with PRI/PD inter-pulse modulation) within intervals characterizing the allowable range of signal parameter changes for the given class of signals. Algorithms 1-3, presented as pseudocode, describe the process of adding subsequent signals to the output vector S.
The second step is to introduce the interference and noise into the vector of output signals. The noise introduced is additive. It is worth mentioning that signals from different classes added to the same sample vector disturb each other by interfering with one another.
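Since Algorithms 1-3 are not reproduced legibly here, the pulse-placement loop they describe can be sketched as follows. This is a hypothetical reconstruction: the function and parameter names are assumptions, and the uniform draws follow the description above.

```python
import math
import random

def add_class_signal(S, fs, pri_range, pd_range, rf_range, amp=1.0, seed=None):
    """Add pulses of one signal class to the shared output vector S (in place).

    Each pulse starts PRI seconds after the previous one; PRI, PD and RF are
    drawn uniformly from the allowed ranges of the class.
    S  -- output sample vector of length N
    fs -- sampling rate [Hz]
    """
    rng = random.Random(seed)
    n = len(S)
    t = 0  # sample index of the next pulse start
    while t < n:
        pri = rng.uniform(*pri_range)  # pulse repetition interval [s]
        pd = rng.uniform(*pd_range)    # pulse duration [s]
        rf = rng.uniform(*rf_range)    # carrier frequency [Hz]
        for k in range(int(pd * fs)):
            if t + k >= n:
                break
            # addition (not assignment), so overlapping classes interfere
            S[t + k] += amp * math.sin(2.0 * math.pi * rf * (t + k) / fs)
        t += int(pri * fs)
    return S
```

Calling this once per class on the same vector S, and then adding noise samples, reproduces the additive-interference behaviour noted in the second step.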
The third simulation step filters the entire resulting output vector S with each of the 18 FIR filters, each of which is assigned to a given signal class. The block diagram of filtering the output vector for each class of signals is shown in Algorithm 4. As can be deduced from the above-presented algorithms, the output vector is given as an argument to each of the 18 FIR filters. The filtering function of the firFilter instance of the FIR class returns the signal vector filtered for the i-th class, which is then sent to the CNN input, and an attempt is made to detect the given signal class. DNN_isClassSignal is a function which analyzes the occurrence of a given signal class and returns the probability that the recognized signal is located in the given input vector.
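Algorithm 4's per-class filtering and detection loop can be sketched as below; `fir_filter` stands in for the firFilter instance and the callback stands in for DNN_isClassSignal (the names, signatures and threshold here are assumptions, not the paper's interface).

```python
def fir_filter(h, x):
    """Direct-form FIR convolution, returning a vector the same length as x."""
    n, m = len(x), len(h)
    return [sum(h[k] * x[i - k] for k in range(m) if 0 <= i - k < n)
            for i in range(n)]

def detect_classes(S, class_filters, is_class_signal, threshold=0.5):
    """Filter the shared output vector S for each class and query the CNN.

    class_filters   -- list of FIR coefficient vectors, one per signal class
    is_class_signal -- callback(class_id, filtered) returning a probability,
                       standing in for DNN_isClassSignal
    """
    detected = []
    for class_id, h in enumerate(class_filters):
        filtered = fir_filter(h, S)          # isolate the band of this class
        p = is_class_signal(class_id, filtered)
        if p >= threshold:
            detected.append((class_id, p))
    return detected
```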

Experiment Results
The calculation results of the effectiveness of radar signal recognition based on the post-detection PRI and PD determination and the sampled time form of the signal are presented in Tables 7-12, where the disruption level means the amplitude level of the interfering signals. The characteristic in Figure 13 indicates that teaching the CNN to recognize the correct class on the basis of the PD parameter alone is unlikely: about 10% effectiveness was achieved with a training dataset of 40-70 elements. This is due to the large overlap of the signal operation intervals for the PD parameter.
In the case of the CNN analysis of the PRI parameter (Figure 14), compared with the PD analysis (Figure 13), the results achieved are much better: approximately 70% effectiveness with a training package of 20 elements and 60 epochs in the learning process. With larger packages, above the level of 70 elements, the quality of learning and CNN performance showed a descending trend as a function of the changes in the size of the training dataset and the number of iterations in the training algorithm.
The analysis of the results of the CNN recognition of TW samples (Figure 15) showed effectiveness similar to that of the CNN in the PRI sample analysis. However, here the downward trend along with the growth of the learning dataset is much more noticeable. The optimal learning point turned out to be a package of 20 learning patterns and 140 epochs of the learning process. The combination of three CNN structures with the concatenation operation and an additional classifier with dense connections (Figure 7), which analyzes the combined tensor, allowed one to achieve an efficiency of radar signal class recognition at the level of 90-99% (Figure 16).
An important parameter that influences the qualitative operation of the structure (Figure 7, Table 4) is the size of the input vector accepted by the CNN. In this case, the sizes of the CNN inputs which analyzed the PRI and PD parameters changed at the same time. The characteristics clearly indicate that accepting a larger vector of samples (a longer observation time) improved the efficiency of signal classification. The size of the package also turned out to be important; in this case, it should be at the level of about 350 elements for the CNN to achieve high effectiveness in the learning process.
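The concatenation step of Figure 7 can be illustrated with a minimal stand-in: the three branch outputs (PD, PRI, TW) are joined into one vector and passed through a fully-connected softmax layer over the signal classes. The names, sizes and weights here are illustrative assumptions, not the paper's trained model.

```python
import math

def dense_softmax(features, weights, biases):
    """One fully-connected layer with a softmax over the signal classes."""
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def classify_concatenated(pd_feat, pri_feat, tw_feat, weights, biases):
    """Concatenate the three CNN branch outputs and run the dense classifier."""
    combined = list(pd_feat) + list(pri_feat) + list(tw_feat)
    return dense_softmax(combined, weights, biases)
```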
The last examined structure (Table 5, Figure 8) shows the efficiency of classification when the raw, unprocessed sampled signal is considered. The characteristics of CNN operation in the case of disturbances are presented in Figure 17. The results of CNN operation without interference were at the level of 99% probability of correct recognition; therefore, the tables and graphs of the CNN operation without interference were omitted. The blue curve in Figure 17 shows the effectiveness of the CNN in recognizing radar signals when a certain number of radio sources are present at different distances around the reconnaissance station. The noise levels (horizontal line) illustrate the signal amplitude level of the interfering sources. The interfering signals, in this case, were the signals of the 18 classes of radar signals examined earlier. To simulate the disturbance effect, the generated signals for the 18 classes to be recognized were superimposed on the remaining signals with a given amplitude (0.1-0.9). The red curve shows the disturbance noise.
A CNN working on the raw signals showed high resistance to noise interference. However, it coped worse in the case of interference from other signals, where the probability of correct recognition tended to decline as the amplitude of the disturbing signals increased. To overcome this problem, the observation time (the number of samples received by the CNN at the input) was increased, which resulted in an effectiveness of 92.2% (Table 11).

Conclusions
On the basis of the obtained effectiveness measurements, the high effectiveness of the CNN in recognizing digital signals can be noted, both when the signals are analyzed post-detection based on the PRI and PD parameters and the form of the time waveform (RF and PD), and when the raw (unprocessed sampled) signal is analyzed. Simultaneous analysis of the set of three signal parameters was possible thanks to the concatenation of the three CNN models (PD, PRI, TW) and the analysis of the resulting tensor by an ANN with a dense, fully-connected architecture.
The achieved probabilities of correct signal recognition were high, ranging from 90% to 99%. However, in order to achieve such network efficiency in the case of post-detection analysis, the CNNs are required to analyze more than two parameters of the radar signal. Otherwise, if each signal parameter is analyzed separately, the radar signals cannot be properly classified. This is due to the overlapping ranges of the operating parameters of the individual signal classes (Table 2), and was confirmed by the signal recognition efficiency results of the CNN (Tables 7-9 and Figures 13-15), where analyzing each parameter separately allowed one to achieve maximum correct recognition probabilities of 70-72% for PRI and TW, and only 11-16% for the PD parameter.
The designed CNN has the ability to classify signals on the basis of raw data analysis, also in the presence of interference, with a 92-99% probability of correct recognition (Tables 11 and 12). When working with interference, the effectiveness of our CNN largely depends on the capabilities and sensitivity of the receiver, i.e., on its signal processing abilities such as the sampling frequency. Parameters such as the package size and the number of iterations (epochs) of the training algorithm were important for the convergence of a particular CNN, and depend both on the architecture of the given CNN and on the training dataset (Tables 7-10, Figures 13-16).
Increasing the size of the CNN input vector, which essentially corresponds to the observation time or sampling rate, significantly improved the performance of the combined CNN (Table 4, Figure 7) and of the CNN analyzing the raw signal (Table 5, Figure 8), especially in the event of disturbances (Table 11).
The results presented here were achieved in a relatively short time: about 2 h for a single learning process for the post-detection analyzing networks, and about 24 h for a network working on the raw signals, thanks to the scaling of high-frequency signals to lower frequencies. In the case of direct operation at high frequencies, changing the size of the CNN inputs and selecting an appropriate number of convolutional filters will be required. The available learning time and a hardware architecture appropriate for carrying out such a learning process should also be taken into account.