Novel Unconventional-Active-Jamming Recognition Method for Wideband Radars Based on Visibility Graphs

Radar unconventional active jamming, including unconventional deceptive jamming and barrage jamming, poses a serious threat to wideband radars. This paper proposes an unconventional-active-jamming recognition method for wideband radar. In this method, the visibility algorithm, which converts a radar time series into a graph called a visibility graph, is first given. Then, the visibility graph of the linear-frequency-modulation (LFM) signal is proved to be a regular graph, and the rationality of extracting features on visibility graphs is theoretically explained. Four features, namely the average degree, average clustering coefficient, Newman assortativity coefficient, and normalized network-structure entropy, are then extracted from the visibility graphs. Finally, a random-forests (RF) classifier is chosen for unconventional-active-jamming recognition. Experimental results show that the recognition probability was over 90% when the jamming-to-noise ratio (JNR) was above 0 dB.


Introduction
Electronic countermeasure (ECM) techniques, such as active deceptive jamming, can generate false targets imitating real ones, and even disturb target detection or tracking [1,2]. Active barrage jamming is another ECM technique, which uses noise or noise-like signals to barrage the target echo, preventing the radar from detecting targets or measuring target parameters. Its basic principle is to use high-power random or pseudo-random signals, thereby reducing radar-detection probability. On the other hand, with the rapid development of radar electronic counter-countermeasure (ECCM) techniques, methods such as side-lobe cancellation and blanking [3], space-time adaptive processing [4], and adaptive beamforming [5] have gradually weakened the performance of conventional radar active jamming.
Unconventional active jamming is generated by replicating the transmitted signals of the victim radar, followed by a series of parameter modulations based on digital-radio-frequency memory (DRFM) technology. Compared with conventional active deceptive and barrage jamming, it is coherent with the radar transmitted signals and can obtain some processing gain from signal processing that includes pulse compression and coherent accumulation. Therefore, conventional ECCM techniques become ineffective, and the jamming always enters the radar receiver. One of the most effective schemes to weaken the jamming is the so-called signal-processing-based scheme, which is designed to recognize and suppress jamming based on the difference between the echo and the jamming. Commonly used approaches to recognizing radar active jamming can be divided into two steps: feature extraction and classifier design. The distinguishability of the jamming-signal features directly affects the recognition probability of the subsequent classifier. The majority of feature extraction operates on the time/frequency/spatial [6], polarization [7], statistical [8,9], and multiscale-joint domains [10]. These features show the information of the jamming from different perspectives, and in this paper we consider a new perspective from which to extract jamming features.
Time-series analysis is the use of statistical methods to analyze time-series data and extract meaningful statistics and characteristics of the data. Because jamming signals always carry some fingerprint features, nonlinear time-series analysis can be used to characterize the complexity of the signal [11,12]. At the end of the 20th century, with the rapid development of computer and Internet technology, the theory of complex networks [13,14] was proposed to describe real-world networks. Complex-network theory can be a useful way to characterize time series, and a series of related achievements, including visibility graphs and horizontal visibility graphs [15,16], recurrence networks [17,18], and correlation networks [19], were presented for analyzing the structural properties of time series by complex networks. The visibility algorithm was originally applied to computational geometry and robot motion planning. In 2008, Reference [15] used the visibility algorithm for time-series analysis by converting time series into graphs, called visibility graphs. Up to now, the visibility algorithm has been used in many fields, such as financial [20], traffic [21], and geographic time-series analysis [22].
In this paper, we propose a novel method for radar unconventional-active-jamming recognition based on visibility graphs. The representations for different types of unconventional active jamming were derived for illustration purposes in Section 2. In Section 3, the visibility algorithm that converts a time series into a graph is given and the concept of visibility graphs is introduced; then the feasibility of the visibility algorithm for radar signals is analyzed. Section 4 extracts four features on visibility graphs that are valid for jamming recognition. Then, Section 5 outlines numerical results and analysis for unconventional-active-jamming recognition based on a random-forests classifier. Finally, in Section 6, the main conclusions of the paper are summarized.

Unconventional Deceptive Jamming
At present, most unconventional deceptive jamming uses DRFM technology to achieve accurate interception of radar signals and to generate various types of deceptive jamming. These interception methods mainly include three modes: full pulse, partial pulse, and interrupted sampling. Unconventional deceptive jamming is commonly generated by the partial-pulse and interrupted-sampling modes, which include interrupted-sampling directly jamming (ISDJ), interrupted-sampling repeater jamming (ISRJ), interrupted-sampling circularly jamming (ISCJ), and partial-pulse dense transmitted jamming (PDTJ).
The generation mechanism of ISDJ is to use DRFM technology to retransmit part of the radar signal immediately after intercepting it, and to repeat this "interception-retransmission" process until the falling edge of the radar signal is detected. ISDJ is expressed as follows: where s(t) stands for the intercepted radar signal, N J is the number of slices, and T W is the slice width. The mechanism of ISRJ is similar to that of ISDJ, but every intercepted slice is retransmitted multiple times. The model of ISRJ can be written as: where M J is the number of retransmissions. ISCJ also belongs to the interrupted-sampling modes. Not only is the intercepted signal retransmitted, but the previously intercepted signal is also retransmitted in reverse order after primary sampling. ISCJ can be expressed as: where α(m) = [m(m − 1)/2 − 1]T W represents the time delay in intercepting the m-th slice, and β(m, n) = [n(n + 1)/2 + m(n − 1)]T W represents the time delay when the m-th slice is retransmitted for the n-th time. PDTJ uses the partial-pulse interception method. After intercepting a portion of the radar signal, the jammer starts to retransmit this sampled pulse continuously, which directly forms dense false targets on the radar range profile without having to wait for the jammer to intercept and store the complete signal. PDTJ is expressed as follows:

Unconventional Barrage Jamming
According to different modulation modes, conventional barrage jamming can be divided into radio-frequency noise jamming, noise-amplitude modulation jamming, noise-frequency modulation jamming, etc. Their time domain models can be uniformly modeled as: where A J (t), ω J (t), and ϕ J (t) are the instantaneous amplitude, frequency and phase function of the jamming signal, respectively. Regardless of the modulation method, the contradiction between power and frequency band coverage cannot be solved. In contrast, unconventional barrage jamming uses noise modulation on the intercepted radar signal to improve coherence with the radar signal, while preserving random and non-stationary jamming characteristics. It has the advantage of aiming at the radar carrier frequency with in-band high noise power. It can be mainly divided into noise productive jamming (NPJ) and noise convolutional jamming (NCJ).
NPJ is obtained by multiplying the intercepted radar signal by local noise in the time domain, which can be derived as: where n(t) is the local noise. Local noise shifts the intercepted radar signal in the frequency domain, and the maximum spectral offset is the bandwidth of the local noise. When the radar signal is a pulse-compression signal, since the jamming is correlated with the radar signal, the jamming also obtains part of the gain through the matched filter. Therefore, NPJ resembles noise in the time domain, and its spectrum automatically aligns with the center frequency of the radar signal, effectively reducing the power required for jamming while still ensuring a noise barrage.
Similarly, NCJ convolves the intercepted radar signal with local noise in the time domain, which can be written as: NCJ is essentially a sum of delayed copies of the radar signal multiplied by different coefficients, so it can also obtain the matched-filter gain and barrage the signal in the time domain with less power. Sparse noise convolutional jamming (SNCJ) is an improvement on NCJ that is combined with the interrupted-sampling interception method. The intercepted radar signal is first sampled using a rectangular pulse series and then convolutionally modulated onto the local noise. Compared with NCJ, SNCJ can produce the same effect with less power. The model of SNCJ is expressed as follows: We can consider ISDJ as a special case of ISRJ when the number of retransmissions is one, and we can consider NCJ as a special case of SNCJ if the SNCJ sampling window is the entire radar signal. Therefore, this paper focuses on the recognition of radar unconventional active jamming, including ISRJ, ISCJ, PDTJ, NPJ, and SNCJ.
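As a rough illustration of the two modulations described above, the following sketch forms NPJ as a sample-wise product and NCJ as a linear convolution of a baseband LFM pulse with Gaussian noise. It is a minimal sketch under assumptions: the function names, the chirp rate, the sampling rate, and the noise length are hypothetical choices, and amplitude scaling and the jammer's interception delay are ignored.

```python
import math
import random


def lfm(n_samples, mu, fs):
    """Baseband LFM samples cos(pi * mu * t^2) with t = k / fs."""
    return [math.cos(math.pi * mu * (k / fs) ** 2) for k in range(n_samples)]


def noise_product_jamming(s, noise):
    """NPJ sketch: sample-wise product of the intercepted signal and local noise."""
    return [x * w for x, w in zip(s, noise)]


def noise_convolution_jamming(s, noise):
    """NCJ sketch: full linear convolution of the intercepted signal with local noise."""
    out = [0.0] * (len(s) + len(noise) - 1)
    for i, x in enumerate(s):
        for j, w in enumerate(noise):
            out[i + j] += x * w  # delayed, weighted copy of the radar signal
    return out


rng = random.Random(0)                        # fixed seed for repeatability
s = lfm(n_samples=64, mu=2.0e3, fs=1.0e3)     # hypothetical chirp parameters
noise = [rng.gauss(0.0, 1.0) for _ in range(64)]
npj = noise_product_jamming(s, noise)
ncj = noise_convolution_jamming(s, noise)
print(len(npj), len(ncj))                     # 64 127
```

The convolution loop makes the "delay addition" interpretation explicit: each noise sample contributes one delayed, scaled copy of the intercepted signal.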

Mathematic Principle
A graph is a system that contains a large number of individuals and interactions between individuals. If individuals are regarded as vertices, and interactions between individuals are regarded as a connection between vertices, then any complex system can be represented as a graph.
A graph can be defined as a binary set G = {V, ε}, where V = {v 1 , v 2 , . . . , v N } and ε = {e 1 , e 2 , . . . , e M } are the vertex set and the edge set, respectively. The elements of V and ε are the vertices and edges, and their numbers are denoted as the order N = |V| and the size M = |ε|. Each edge has a corresponding pair of vertices. The graph is called an undirected graph if the vertex pairs (v i , v j ) and (v j , v i ) correspond to the same edge; otherwise, it is a directed graph, and (v i , v j ) represents an edge from v i to v j . Moreover, the graph is called a weighted graph if each edge is given a corresponding weight; otherwise, it is an unweighted graph.
The visibility algorithm converts a time series into a graph and is used for time-series analysis. Reference [15] applied the visibility algorithm with the following definition: suppose S = {s i }, i = 1, 2, · · · , N is a signal containing N samples. Two arbitrary data (a, s a ) and (b, s b ) have visibility, and consequently become two connected vertices v a and v b of the associated graph G, if any other datum (c, s c ) placed between them satisfies: s c < s b + (s a − s b )(b − c)/(b − a). The associated graphs derived from the visibility algorithm are called visibility graphs, and the number of vertices on a visibility graph is the same as the number of data in the time series. For illustrative purposes, we plotted an example of the visibility algorithm in Figure 1. The upper part of Figure 1 shows the first 20 data of a periodic time series; the values are 0.8, 0.5, 0.4, 0.6, 0.8, 0.5, 0.4, 0.6, 0.8, 0.5, 0.4, 0.6, 0.8, 0.5, 0.4, 0.6, 0.8, 0.5, 0.4, and 0.6. Each datum is represented as a column, and the height of the column represents the numerical value of the datum. Two data have visibility if the tops of the corresponding two columns can be connected with a straight line that no intermediate column blocks, and invisibility otherwise. The lower part of Figure 1 shows the resulting visibility graph in a more concise form, where every vertex corresponds to a datum of the series in the same order.
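The visibility criterion can be sketched in a few lines of Python. This is a minimal illustration, not an optimized implementation; the function names `visibility_graph` and `degrees` and the four-sample toy series are hypothetical, and vertices are indexed from 0 rather than 1.

```python
def visibility_graph(series):
    """Return the edge set of the visibility graph of `series`.

    (a, b) with a < b is an edge when every sample strictly between them
    lies below the straight line joining (a, s_a) and (b, s_b).
    """
    n = len(series)
    edges = set()
    for a in range(n - 1):
        for b in range(a + 1, n):
            sa, sb = series[a], series[b]
            visible = all(
                series[c] < sb + (sa - sb) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:  # adjacent samples are always visible (empty range)
                edges.add((a, b))
    return edges


def degrees(n, edges):
    """Degree of each of the n vertices for an undirected edge set."""
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


series = [1, 3, 2, 4]                  # toy data, not from the paper
edges = visibility_graph(series)
print(sorted(edges))                   # [(0, 1), (1, 2), (1, 3), (2, 3)]
print(degrees(len(series), edges))     # [1, 3, 2, 2]
```

In the toy series, the second sample blocks the view from the first sample to everything on its right, which is why vertex 0 has degree 1.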
It is easy to verify that the visibility graphs obtained by the visibility algorithm have the following properties:

1. Connectivity: Each vertex is connected to at least its left and right neighbors. If the vertex has only a left (right) neighbor, it is connected to at least that neighbor;
2. Undirected and unweighted: The generated network is an undirected and unweighted network;
3. Affine transformation invariance: After rescaling both horizontal and vertical axes, or after horizontal and vertical translations, the topology of the network does not change.
In complex-network theory, the degree of a vertex is defined as the number of connections the vertex has to other vertices, and the degree distribution is then defined by the fraction of vertices in the network that have each degree. Numerous real-world and theoretical networks satisfy a certain degree distribution. Reference [15] found that the visibility algorithm can convert periodic time series into regular networks with a Dirac degree distribution, which is the fingerprint of the time-series period. That means that visibility graphs can structurally conserve or inherit the regularity of periodic time series. Moreover, the visibility algorithm also converts random time series into random networks with an exponential degree distribution, and converts fractal time series into scale-free networks with a power-law degree distribution, which can be utilized to detect the difference between random and chaotic series.
Therefore, it is a natural idea that we can use the visibility algorithm to establish a natural bridge between the jamming signal and visibility graphs. The key question is to study the degree distribution characteristics that the signal may retain after being converted to visibility graphs.

Sinusoidal Signal
To facilitate the analysis of the degree-distribution characteristics of wideband signals, we first study the sinusoidal signal on visibility graphs, and assume that a discrete sinusoidal signal is expressed as s 1 (t) = cos(2π f 0 t). Since a sinusoidal signal is periodic, we mainly discuss three regions in one period: region A, from the maximum at t = 0 to the first zero crossing at t = 1/(4 f 0 ); region B, between the two zero crossings at t = 1/(4 f 0 ) and t = 3/(4 f 0 ); and region C, from the second zero crossing to the next maximum at t = 1/ f 0 . Two data points that have visibility or invisibility with each other are written as v i ↔ v j or v i ↮ v j , respectively. Likewise, if any two data points on a region P have visibility or invisibility, we write P ↔ or P ↮.
For any two points t B1 , t B2 on region B, the following equations are satisfied: Then we can find that region B is a convex region: Therefore, any two points on region B always have visibility, that is, B ↔. Similarly, regions A and C are concave regions; consequently, A ↮ and C ↮. For any point t A on region A, draw the tangent to s 1 (t) through t A , which meets the curve at data point t LA . It is easy to find that, for any point t M on region B or region C, when t M < t LA and t A < t P < t M are satisfied, then: We define t LA as the left limit visible point (LVP) of t A . The left or right LVP is the farthest of all the points that have visibility with t A on the corresponding side. t LA can be solved by tangential properties: Equation (15) is a transcendental equation whose solution cannot be found analytically. For t A on region A, the approximate solution was obtained by the graphical method. The change law of t LA with t A is shown in Figure 2a.
In addition, we should also find the right LVP t RA of t A , which is the tangent point of the line through t A that is tangent to s 1 (t). It is easy to find that, for any point t N on region C (because the right LVP must be in region C), when t N < t RA and: Similarly, in order to get the solution of t RA , the tangential property is available: Equation (17) is a transcendental equation whose analytical solution cannot be obtained. The approximate solution obtained by the graphical method was also used. The change law of t RA with t A is shown in Figure 2b. For normalization, the axes in the figure are multiplied by f 0 .
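The graphical solution can also be approximated numerically. The sketch below assumes one reading of the tangential property, namely that t LA is the point where the tangent line drawn at t A meets s 1 (t) again, and finds it by bisection. The function name `left_lvp`, the bracketing interval, and the tolerance are illustrative assumptions rather than the authors' procedure.

```python
import math


def left_lvp(t_a, f0=1.0, tol=1e-12):
    """Approximate t_LA for s1(t) = cos(2*pi*f0*t) by bisection.

    Assumes 0 < t_a < 1/(4*f0), i.e. t_a lies strictly inside region A,
    and that t_LA is where the tangent at t_a crosses the curve again.
    """
    w = 2.0 * math.pi * f0
    s_a = math.cos(w * t_a)
    slope = -w * math.sin(w * t_a)      # derivative of s1 at t_a

    def gap(t):
        # Tangent line minus curve: positive at t = 1/(4*f0), negative at t = 1/f0.
        return s_a + slope * (t - t_a) - math.cos(w * t)

    lo, hi = 0.25 / f0, 1.0 / f0        # bracket with a sign change
    while hi - lo > tol / f0:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for t_a in (0.05, 0.10, 0.20):
    print(t_a, left_lvp(t_a))           # normalized by f0 = 1
```

Under this assumption the computed point moves from region C toward the left end of region B as t A increases, which matches the qualitative behaviour described for Figure 2a.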
In Figure 2a, as t A moves from the left end to the right end of region A, t LA moves from the right end of region C to the left end of region B, and its moving speed gradually slows down. When t A reaches the end of region A, the intersection of the tangent coincides with the vertex itself. In Figure 2b, as t A moves along the same route, t RA moves from the right end toward the left of region C, but its movement is almost negligible: by the time t A stops moving, t RA has moved only about 3.5% of the total number of vertices in the entire period (as can be read from the Y-axis of Figure 2b).
Then the visible region of t A is defined between t LA and t RA , which means any point in the visible region has visibility with t A . A more intuitive schematic diagram is shown in Figure 3.
Similarly, for any point t C on region C, there is also a left LVP t LC on region A, and the right LVP t RC is on region A or B. The change laws of t LC and t RC are shown in Figure 4, where they mirror the change laws of t RA and t LA .
Finally, consider any point t B on convex region B; its left LVP t LB and right LVP t RB are on region A and region C, respectively. Similar to the derivation for t A , we can obtain the change laws of t LB and t RB with t B , which are shown in Figure 5. It is worth noting that t LB , t RB , t RA , and t LC are tangent points, while t LA and t RC are not.
As can be seen from Figure 5, as t B moves from left to right on region B, t LB moves from right to left at a substantially constant speed in region A. When t B moves to the end, t LB does not reach the left end of region A. As t B moves from right to left on region B, the movement laws of t RA and t LB mirror each other.
It is obvious that the point on region C and the point on region A of the next period have invisibility (because t = 1/ f 0 has the largest data value) and do not add a new degree. Following the above discussion, we can solve degree d 1 of visibility graphs G 1 corresponding to s 1 (t): where f s represents the sampling frequency.
Assume that s 1 (t) contains 300 samples, corresponding to three periods. The degree and its distribution of G 1 are shown in Figure 6.
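The degree sequence behind such a plot can be reproduced approximately with a short script. The sketch below assumes f s / f 0 = 100 (300 samples over three periods) and uses a running-maximum-slope test, which is equivalent to the visibility criterion. The function name `visibility_degrees` is hypothetical, and samples that fall exactly on a line of sight may be affected by floating-point rounding.

```python
import math
from collections import Counter


def visibility_degrees(series):
    """Degree of every vertex of the visibility graph of `series`.

    From a fixed sample a, sample b is visible exactly when the slope of the
    line a -> b exceeds the slope from a to every sample in between.
    """
    n = len(series)
    deg = [0] * n
    for a in range(n - 1):
        max_slope = -math.inf
        for b in range(a + 1, n):
            slope = (series[b] - series[a]) / (b - a)
            if slope > max_slope:        # nothing in between blocks the view
                deg[a] += 1
                deg[b] += 1
                max_slope = slope
    return deg


samples_per_period = 100                 # assumed ratio f_s / f_0
s1 = [math.cos(2.0 * math.pi * k / samples_per_period) for k in range(300)]
deg = visibility_degrees(s1)
hist = Counter(deg)                      # degree -> number of vertices
for d, count in sorted(hist.items()):
    print(d, count / len(deg))           # empirical degree distribution
```

The running-maximum form needs only O(N²) operations, which keeps the sketch usable for the few hundred samples considered here.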
As shown in Figure 6, the degrees of the vertices on region B change only slightly, so the peak of the degree distribution is mainly related to the number of vertices on region B, and the peak position is f s (t RB − t LB ) ≈ 3 f s /4 f 0 . Near the peak position, there is about a 50% probability in total because region B accounts for half of the entire period. Therefore, the sinusoidal signal s 1 (t) can be converted into a regular graph by the visibility algorithm, and the degree distribution of G 1 can be roughly expressed as a single-spike distribution, which is given by:

Linear Frequency Modulation Signal
The linear frequency modulation (LFM) signal is a typical radar wideband signal. Suppose the baseband LFM signal is s 2 (t) = cos(πµt²), and define the maximum repetition period (MRP) as the interval from t m1 to t m2 , where t m1 is the data point corresponding to a maximum value, and t m2 is the data point corresponding to the next maximum value. Although s 2 (t) is a non-periodic signal, it can be divided into three regions, X, Y, and Z, in one MRP. Region X is defined from t m1 to data point t 01 corresponding to the first zero value, region Y is from t 01 to data point t 02 corresponding to the second zero value, and region Z is from t 02 to t m2 .
Similar to a sinusoidal signal, it is easy to prove that regions X and Z are concave regions, and region Y is a convex region, i.e., X ↮, Y ↔, Z ↮. In the three regions, degree d 2 of any vertex is also related to left LVP t L and right LVP t R , as expressed by:
Assume that s 2 (t) contains 500 samples corresponding to five MRPs. The degree and its distribution of s 2 (t) after being converted to visibility graph G 2 by the visibility algorithm are shown in Figure 7. In Figure 7, every MRP has some vertices whose degrees are approximately equal, so the degree distribution has a plurality of peaks with decreasing amplitude. Therefore, the non-periodic baseband LFM signal s 2 (t) can still be converted to a regular network by the visibility algorithm, and its degree distribution can be roughly expressed as a multi-spike distribution as follows: where N R is the number of MRPs, P i is the probability corresponding to the i-th MRP, and d i is the degree corresponding to the i-th MRP.
Actually, the received signal includes not only the echo but also noise obeying a Gaussian distribution. In visibility graphs, it is easy to find that data with large values always have a large degree, and the degree of their adjacent data is relatively small. According to this property, the degree distribution of Gaussian noise begins as a Poisson distribution, but because of the existence of some vertices with large degree values (called hubs), the tail of the degree distribution gradually satisfies the exponential distribution. Therefore, Gaussian noise can be converted to an exponential random network by the visibility algorithm, whose degree distribution is different from those of sinusoidal and LFM signals, which shows that the degree distribution can distinguish radar signals from noise. Figure 8 illustrates the degree and its distribution of Gaussian noise with 300 samples after conversion to visibility graph G 3 .
In an additive Gaussian noise environment, the actual received signal is randomly increased or decreased by a certain amount relative to the original signal, which can be treated as a change of local values of the time series. For a datum on the convex or concave region, the change (increase or decrease) of a local value causes a degree change of more than two vertices, indicating that network-structure changes amplify time-series changes to some extent. Hence, features on visibility graphs are more sensitive to noise than other features in the time or frequency domain.
We first consider an LFM signal of 500 samples at a signal-to-noise ratio (SNR) of 10 dB. The degree and its distribution are shown in Figure 9. Comparing Figure 9 with Figure 7, it can be seen that the degree of each vertex changed significantly after the introduction of Gaussian noise. The head and tail of the degree distribution still follow the Poisson and exponential distributions, respectively, matching the degree distribution of Gaussian noise. The middle of the degree distribution has a spike, just like the LFM signal. Accordingly, we can understand the degree distribution of the actual LFM signal as a transition from the multi-spike distribution to the exponential distribution.

Jamming Signal
ISRJ, ISCJ, and PDTJ have different numbers of MRPs, while NPJ and SNCJ have noise characteristics similar to Gaussian noise in the time domain. According to the above analysis, the degree distributions of different types of unconventional active jamming should therefore have different characteristics. Owing to space limitations, we only show the degree distributions of ISRJ and NPJ in Figure 10 to verify the differences between the jamming types.
Therefore, we have the following conclusions: The degree distributions of the sinusoidal and LFM signals satisfy the single-spike distribution and multi-spike distribution, respectively, and the degree distribution of Gaussian noise is an exponential distribution. Moreover, the degree distribution of the actual received signal lies between those of the ideal signal and the noise signal. Most importantly, different types of unconventional active jamming have different degree distributions on visibility graphs, which prepares for extracting more features (not only degree features) on visibility graphs, as shown below.

Figure 10. Degree distribution of jamming signals on visibility graphs (jamming-to-noise ratio (JNR) = 10 dB). (a) Interrupted-sampling repeater jamming (ISRJ); (b) noise productive jamming (NPJ).

Feature Extraction on Visibility Graphs
An adjacency matrix characterizes the adjacency relation of a graph. For a given graph $G$ of order $N$, its adjacency matrix $A$ is an $N \times N$ square matrix. If vertices $v_i$ and $v_j$ of $G$ are connected by an edge, the corresponding element $(A)_{ij}$ represents the weight of that edge; otherwise, by definition, $(A)_{ij} = 0$. Because visibility graphs are unweighted, $(A)_{ij} = 1$ whenever $v_i$ and $v_j$ are connected. Defined in this way, the adjacency matrix completely represents the adjacency of the vertices of the graph.
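To make the adjacency matrix of a visibility graph concrete, the sketch below builds it from a short time series. It assumes the natural visibility criterion of Lacasa et al. (two samples are connected when every sample between them lies strictly below the straight line joining them) and uniformly spaced samples; the function name and the five-sample series are illustrative and do not reproduce the paper's signals.

```python
def visibility_adjacency(series):
    """Return the 0/1 adjacency matrix of the natural visibility graph of a time series."""
    n = len(series)
    adj = [[0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            # Samples a and b see each other if every intermediate sample c lies
            # strictly below the line joining (a, y_a) and (b, y_b).
            visible = all(
                series[c] < series[b] + (series[a] - series[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                adj[a][b] = adj[b][a] = 1  # undirected, unweighted edge
    return adj


# Illustrative five-sample series (not radar data).
A = visibility_adjacency([3.0, 1.0, 2.5, 0.5, 4.0])
for row in A:
    print(row)
```

Neighboring samples are always connected because no sample lies between them, so the visibility graph of a series is always connected.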

Average Degree
In addition to the degree distribution, the average degree is another simple and important quantity that describes the properties of the whole network. It is obtained by averaging the degrees of all vertices in the graph:

$$\bar{d} = \frac{1}{N}\sum_{i=1}^{N} d_i,$$

where $d_i$ is the degree of $v_i$. Because visibility graphs are undirected and unweighted, the sum of the elements of the $i$-th row (or column) of the adjacency matrix equals $d_i$, which is also the $i$-th diagonal element of $A^2$. The average degree can therefore be written in terms of the diagonal elements of $A^2$:

$$\bar{d} = \frac{1}{N}\operatorname{tr}\left(A^2\right),$$

where $\operatorname{tr}\left(A^2\right)$ denotes the trace of the matrix $A^2$.
The average degree reflects the stability of the signal: the larger the average degree, the more stable the signal, since Gaussian noise causes a sharp drop at the tail of the degree distribution.
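The two expressions for the average degree can be checked against each other numerically. The following sketch computes the average degree once from the row sums and once from the trace of $A^2$; it assumes a symmetric 0/1 adjacency matrix, and the function names and the path-graph example are illustrative.

```python
def average_degree(adj):
    """Average degree as the mean of the row sums of the adjacency matrix."""
    return sum(sum(row) for row in adj) / len(adj)


def average_degree_from_trace(adj):
    """Average degree as tr(A^2) / N, valid for undirected, unweighted graphs."""
    n = len(adj)
    # (A^2)_ii = sum_k A_ik * A_ki, so the trace sums this over all i.
    trace = sum(adj[i][k] * adj[k][i] for i in range(n) for k in range(n))
    return trace / n


# Illustrative path graph 0 - 1 - 2 - 3 with degrees 1, 2, 2, 1.
path4 = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
d_rows = average_degree(path4)
d_trace = average_degree_from_trace(path4)
print(d_rows, d_trace)  # 1.5 1.5
```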

Average Clustering Coefficient
If a vertex has several vertices directly connected to it, these neighboring vertices are often also directly connected to each other. The quantity that describes this clustering of vertices in the network is the clustering coefficient [23]. The clustering coefficient $C_i$ of a vertex $v_i$ is defined as:

$$C_i = \frac{2e_i}{d_i\,(d_i - 1)},$$

where $e_i$ is the number of edges that actually exist among the $d_i$ neighbors of $v_i$. For undirected and unweighted graphs, the clustering coefficient of a vertex can also be obtained from the adjacency matrix:

$$C_i = \frac{\left(A^3\right)_{ii}}{d_i\,(d_i - 1)},$$

with $d_i = \left(A^2\right)_{ii}$. Averaging the clustering coefficients of all vertices, we obtain the average clustering coefficient of the network:

$$C = \frac{1}{N}\sum_{i=1}^{N} C_i.$$

Obviously, $0 \le C \le 1$. $C = 0$ when no two neighbors of any vertex are connected to each other, as in the extreme case in which all vertices are isolated; $C = 1$ for a complete graph, in which every pair of vertices is connected by an edge. For a completely random network, $C = O(1/N)$ when the number of vertices is large enough. For jamming signals, the denser the vertices on visibility graphs are, the larger the average clustering coefficient.
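The neighbor-counting definition and the matrix form of the clustering coefficient can be compared on a small graph. The sketch below implements both; it assumes a symmetric 0/1 adjacency matrix and adopts the common convention (an assumption here, not stated in the text) that a vertex with fewer than two neighbors has $C_i = 0$. The function names and the example graph are illustrative.

```python
def clustering_coefficients(adj):
    """C_i = 2 e_i / (d_i (d_i - 1)), counting edges among the neighbors of each vertex."""
    n = len(adj)
    coeffs = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j]]
        d = len(nbrs)
        if d < 2:
            coeffs.append(0.0)  # convention assumed for vertices with < 2 neighbors
            continue
        e = sum(adj[u][v] for idx, u in enumerate(nbrs) for v in nbrs[idx + 1:])
        coeffs.append(2.0 * e / (d * (d - 1)))
    return coeffs


def clustering_from_matrix_power(adj):
    """C_i = (A^3)_ii / (d_i (d_i - 1)) for undirected, unweighted graphs."""
    n = len(adj)
    coeffs = []
    for i in range(n):
        d = sum(adj[i])
        cube_ii = sum(adj[i][j] * adj[j][k] * adj[k][i] for j in range(n) for k in range(n))
        coeffs.append(cube_ii / (d * (d - 1)) if d > 1 else 0.0)
    return coeffs


# Illustrative graph: triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
g = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
c_nbr = clustering_coefficients(g)
c_mat = clustering_from_matrix_power(g)
avg_c = sum(c_nbr) / len(c_nbr)
print(c_nbr, avg_c)
```

In this example, vertices 0 and 1 have coefficient 1, vertex 2 has coefficient 1/3 (one edge among three neighbors), and the pendant vertex has coefficient 0, so the average is 7/12.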

Newman Assortativity Coefficient
Except in a completely random network, there is always a correlation between different vertices, and the degree distribution does not fully describe the characteristics of the network. In social networks, vertices tend to be connected with other vertices of similar degree, which is referred to as assortativity; in biological networks, on the other hand, high-degree vertices tend to attach to low-degree vertices, which is referred to as disassortativity.
A method for calculating the assortativity coefficient based on the Pearson correlation coefficient was proposed by Newman [24]. Each edge identifies two vertices and therefore yields two degrees. Traversing all edges produces two degree sequences, and the Pearson correlation of these two sequences is defined as:

$$r = \frac{M^{-1}\sum_{i} j_i k_i - \left[M^{-1}\sum_{i}\tfrac{1}{2}\left(j_i + k_i\right)\right]^2}{M^{-1}\sum_{i}\tfrac{1}{2}\left(j_i^2 + k_i^2\right) - \left[M^{-1}\sum_{i}\tfrac{1}{2}\left(j_i + k_i\right)\right]^2},$$

where $j_i$ and $k_i$ are the degrees of the two vertices at the ends of the $i$-th edge ($i = 1, \ldots, M$) and $M$ is the total number of edges of the network. Obviously, $-1 \le r \le 1$. When $r > 0$, the network is assortative (perfectly assortative for $r = 1$); when $r < 0$, the network is disassortative (completely disassortative for $r = -1$); when $r = 0$, the network is non-assortative. Therefore, the Newman assortativity coefficient reflects, to some extent, the autocorrelation of the data after the signal is converted to visibility graphs.
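The edge-traversal procedure described above can be sketched in a few lines. The code below evaluates the Pearson correlation of the end-vertex degrees over all edges in the symmetric form given by Newman; it assumes a symmetric 0/1 adjacency matrix, and the function name and the two example graphs are illustrative. For a regular graph all degrees are equal, the denominator vanishes, and the coefficient is undefined, so the sketch returns NaN in that case.

```python
def newman_assortativity(adj):
    """Pearson correlation of the degrees at the two ends of every edge."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if adj[i][j]]
    m = len(edges)
    mean_prod = sum(deg[i] * deg[j] for i, j in edges) / m
    mean_half_sum = sum(0.5 * (deg[i] + deg[j]) for i, j in edges) / m
    mean_half_sq = sum(0.5 * (deg[i] ** 2 + deg[j] ** 2) for i, j in edges) / m
    denom = mean_half_sq - mean_half_sum ** 2
    if denom == 0:
        return float("nan")  # regular graph: degree correlation is undefined
    return (mean_prod - mean_half_sum ** 2) / denom


# Star graph: the hub (degree 3) connects only to leaves (degree 1).
star4 = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
# Path graph 0 - 1 - 2 - 3.
path4 = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
r_star = newman_assortativity(star4)
r_path = newman_assortativity(path4)
print(r_star, r_path)
```

The star graph gives $r = -1$ (every edge joins a high-degree vertex to a low-degree vertex), and the four-vertex path gives $r = -0.5$.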

Normalized Network-Structure Entropy
Entropy is a measure of the degree of disorder in a system and is often used in thermodynamics to characterize the state of a substance. In random networks, all vertices are of equal importance, so the structure is considered disordered and has large entropy; in scale-free networks, there are a small number of vertices with large degree values (called hubs) and many vertices with small degree values, that is, the degree-distribution curve decays. The structure is therefore considered ordered and has small entropy.
Network-structure entropy provides a more succinct measure of the order state of complex networks. It is defined as:

$$E = -\sum_{i=1}^{N} I_i \ln I_i,$$

where $I_i$ indicates the importance of vertex $v_i$. It is easy to prove that a homogeneous network ($I_i = 1/N$) has maximal entropy $E_{\max} = \ln N$, whereas a completely inhomogeneous network ($I_1 = 1/2$, $I_i = 1/[2(N-1)]$ for $i > 1$) has minimal entropy $E_{\min} = \tfrac{1}{2}\ln[4(N-1)]$. Therefore, in order to eliminate the influence of the number of vertices $N$ on the entropy of the network structure, we normalize it to obtain the normalized network-structure entropy $E_{\text{norm}}$:

$$E_{\text{norm}} = \frac{E - E_{\min}}{E_{\max} - E_{\min}}.$$
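The two extreme cases above can be reproduced numerically. The sketch below assumes that the importance of a vertex is its normalized degree, $I_i = d_i / \sum_j d_j$; this definition is an assumption, chosen because it yields exactly the stated values $I_1 = 1/2$ and $I_i = 1/[2(N-1)]$ for a star network. The function name and example graphs are illustrative, and the normalization requires $N > 2$ so that $E_{\max} \ne E_{\min}$.

```python
import math


def normalized_structure_entropy(adj):
    """(E - E_min) / (E_max - E_min), with importance I_i = d_i / sum(d) (assumed)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    total = sum(deg)
    importance = [d / total for d in deg]
    entropy = -sum(p * math.log(p) for p in importance if p > 0)
    e_max = math.log(n)                    # homogeneous network, I_i = 1/N
    e_min = 0.5 * math.log(4.0 * (n - 1))  # star network, I_1 = 1/2
    return (entropy - e_min) / (e_max - e_min)


def star(n):
    """Adjacency matrix of a star with vertex 0 as the hub."""
    return [[1 if (i == 0) != (j == 0) else 0 for j in range(n)] for i in range(n)]


def ring(n):
    """Adjacency matrix of a cycle, in which every vertex has degree 2."""
    return [[1 if (i - j) % n in (1, n - 1) else 0 for j in range(n)] for i in range(n)]


e_star = normalized_structure_entropy(star(5))
e_ring = normalized_structure_entropy(ring(5))
print(e_star, e_ring)
```

Up to floating-point rounding, the star network gives 0 (most ordered) and the ring gives 1 (most homogeneous), which are the two ends of the normalized range.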

Simulation and Discussion
In order to verify the effectiveness of the above features on visibility graphs, simulation experiments were performed with the radar-signal and jamming-signal parameters shown in Tables 1 and 2. After the received signal was converted to visibility graphs, four features were selected to perform unconventional-active-jamming recognition: (a) average degree; (b) average clustering coefficient; (c) Newman assortativity coefficient; and (d) normalized network-structure entropy. The JNR was set in steps of 1 dB, and 100 Monte Carlo simulations were performed over the range of 0-25 dB. Figure 11 shows the values of these features. From Figure 11a, it can be seen that the average degree of unconventional barrage jamming (SNCJ, NPJ) was almost independent of the JNR (always near 6), whereas the average degree of unconventional deceptive jamming (ISRJ, ISCJ, and PDTJ) increased with the JNR. The jamming types showed good separability when the JNR was greater than 10 dB, and separability improved as the JNR increased. In fact, in order to exhibit dense false-target characteristics after matched filtering, the JNR of the above-mentioned jamming is always more than 10 dB, so the average degree can be used to distinguish unconventional active jamming. Figure 11b shows that the average clustering coefficient of ISRJ increased with the JNR while that of PDTJ decreased, and that the average clustering coefficient of ISCJ was little affected by the JNR. Numerical analysis of the Newman assortativity coefficient in Figure 11c shows that the coefficient of unconventional barrage jamming was always stable and that SNCJ could be distinguished from NPJ. Moreover, unconventional barrage jamming on visibility graphs is assortative, since it has a positive Newman assortativity coefficient. Therefore, unconventional deceptive jamming can be distinguished through the average clustering coefficient, and the Newman assortativity coefficient can be used as a visibility-graph feature for unconventional barrage jamming recognition.
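One ingredient of such a JNR sweep is scaling the noise to a prescribed jamming-to-noise ratio. The following sketch shows one way this could be done; it assumes real-valued samples, white Gaussian noise, and the JNR defined as the ratio of average jamming power to noise power. The function name, the seed handling, and the cosine test signal are illustrative and are not the paper's simulation setup.

```python
import math
import random


def add_noise_at_jnr(jamming, jnr_db, seed=0):
    """Add white Gaussian noise so that the average jamming-to-noise power ratio is jnr_db."""
    rng = random.Random(seed)
    jam_power = sum(x * x for x in jamming) / len(jamming)
    noise_power = jam_power / (10.0 ** (jnr_db / 10.0))  # dB to linear power ratio
    sigma = math.sqrt(noise_power)
    noisy = [x + rng.gauss(0.0, sigma) for x in jamming]
    return noisy, sigma


# Illustrative stand-in signal: a unit-amplitude cosine with average power 0.5.
jam = [math.cos(0.2 * math.pi * n) for n in range(1000)]
noisy, sigma = add_noise_at_jnr(jam, 10.0)
print(round(sigma ** 2, 6))  # noise power 0.05 for JNR = 10 dB
```

A sweep such as the one described above would call this function for each JNR value from 0 to 25 dB with a different seed per Monte Carlo trial, and then extract the four features from each noisy realization.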
As can be seen from Figure 11d, after conversion to visibility graphs by the visibility algorithm, all jamming signals had very large normalized network-structure entropy. Unconventional deceptive jamming and barrage jamming can be analyzed separately: unconventional deceptive jamming can be well distinguished through normalized network-structure entropy when the JNR is greater than 10 dB, whereas the entropy of unconventional barrage jamming changes little with the JNR and is always distinguishable. Therefore, normalized network-structure entropy is a suitable feature for radar unconventional-active-jamming recognition.
Random forests (RF) is a machine-learning method for classification proposed by Leo Breiman and Adele Cutler [25]. RF combines decision-tree classifiers with the Bagging algorithm; specifically, each sample subset obtained by the Bagging algorithm is used to construct a decision tree. The idea of the RF classifier is to randomly generate multiple decision trees, each of which does not need to have high classification accuracy (a weak classifier). By using the Bagging algorithm to combine multiple decision trees, the over-fitting problem is largely mitigated, and overall generalization ability is improved.
Therefore, we realized a radar unconventional-active-jamming recognition scheme by using the RF classifier; the RF structure is illustrated in Figure 12. The radar unconventional-active-jamming recognition method based on the RF algorithm is summarized as follows:
1. Converting unconventional active jamming from the time domain to visibility graphs using the visibility algorithm;
2. Extracting the average degree, average clustering coefficient, Newman assortativity coefficient, and normalized network-structure entropy as four-dimensional features;
3. Applying the Bagging algorithm to extract training samples from the four-dimensional features and the corresponding jamming category labels, and training decision trees until the number of decision trees reaches the preset threshold;
4. Generating RF structures according to Figure 12a;
5. Sending the test samples to the RF classifier and voting on the classification results of all decision trees according to Figure 12b. The category of jamming with the largest number of votes is taken as the final output of the algorithm.
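Steps 3 and 5 above (bootstrap sampling and majority voting) can be illustrated with a deliberately simplified toy. The sketch below is not the authors' classifier: it uses one-level decision trees (stumps) in place of full decision trees, omits the random feature selection of a full random forest, handles only two classes, and is trained on synthetic two-feature values invented for illustration. All names and numbers are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(X, y):
    """Fit a one-level decision tree by minimizing the number of training errors."""
    # Fallback when no split exists: always predict the majority label.
    best = (len(y) + 1, 0, float("inf"), majority(y), majority(y))
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0  # candidate threshold between adjacent values
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    return best[1:]


def predict_stump(stump, x):
    f, thr, l_lab, r_lab = stump
    return l_lab if x[f] <= thr else r_lab


def train_forest(X, y, n_trees=25, seed=0):
    """Bagging: train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Final output: the label with the largest number of votes."""
    return majority([predict_stump(s, x) for s in forest])


# Synthetic (average degree, normalized entropy) pairs, invented for illustration only.
X = [(6.0, 0.95), (6.1, 0.96), (5.9, 0.94), (6.2, 0.97), (5.8, 0.95), (6.0, 0.96),
     (9.0, 0.80), (9.5, 0.78), (10.0, 0.75), (8.8, 0.82), (9.2, 0.79), (9.8, 0.76)]
y = ["barrage"] * 6 + ["deceptive"] * 6
forest = train_forest(X, y)
print(predict_forest(forest, (6.05, 0.95)), predict_forest(forest, (9.4, 0.79)))
```

In practice, a library implementation of random forests with full-depth trees and per-split feature subsampling would replace the stumps; the sketch only shows how bootstrap resampling and voting fit together.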
In the RF classifier, recognition performance can be further improved by (a) increasing the number of decision trees $N_{DT}$, (b) increasing the maximum depth of the decision trees $D_{DT}$, (c) increasing the number of splits $N_S$, and (d) changing the decision-tree splitting algorithm. Table 3 shows the recognition performance of the RF classifier under different parameters. According to Table 3, the performance of the RF classifier was hardly affected by increasing the number of splits, but could be improved considerably by increasing the number of decision trees. Although the computational time of the algorithm increases rapidly as the decision trees deepen, deeper trees clearly improve the recognition probability at different JNRs, especially at low JNR. Moreover, the ID3 algorithm was always used as the splitting algorithm of the decision trees because other algorithms yield negligible improvement for small training samples and require extra computational time. Therefore, in practice, we prefer a small number of decision trees and splits and an appropriate decision-tree depth to balance recognition probability and computational time.
We chose Groups 6 and 7 as the final simulation parameters. The recognition probability of unconventional active jamming for JNRs from 0 to 25 dB, in 1 dB steps, is shown in Figure 13. Figure 13a,b shows that, when the JNR was higher than 0 dB, the average recognition probability of the algorithm was over 90%. Figure 13b shows that the recognition probability for each type of jamming was always higher than 95% when the JNR was 5 dB or higher. This simulation shows that the unconventional-active-jamming recognition method proposed in this paper is robust to the presence of noise in the jamming and effective in identifying the type of jamming once the JNR exceeds 5 dB, a condition that can often be satisfied.

Conclusions
In this paper, a visibility-graph-based unconventional-active-jamming recognition scheme was proposed. Theoretical analyses first showed that different types of jamming exhibit different degree distributions on visibility graphs. Then, four further features on visibility graphs were calculated: average degree, average clustering coefficient, Newman assortativity coefficient, and normalized network-structure entropy. Finally, an RF classifier based on these four features was proposed to recognize unconventional active jamming. Numerical simulations demonstrated that, with reasonably selected classifier parameters, the average recognition probability is always greater than 97%, regardless of JNR. Further studies will be carried out on combinations of different types of jamming.