Knowledge-Enhanced Compressed Measurements for Detection of Frequency-Hopping Spread Spectrum Signals Based on Task-Specific Information and Deep Neural Networks

The frequency-hopping spread spectrum (FHSS) technique is widely used in secure communications. In this technique, the signal carrier frequency hops over a large band. The conventional non-compressed receiver must sample the signal at high rates to catch the entire frequency-hopping range, which is unfeasible for wide frequency-hopping ranges. In this paper, we propose an efficient adaptive compressed method to measure and detect the FHSS signals non-cooperatively. In contrast to the literature, the FHSS signal-detection method proposed in this paper is achieved directly with compressed sampling rates. The measurement kernels (the non-zero coefficients in the measurement matrix) are designed adaptively, using continuously updated knowledge from the compressed measurement. More importantly, in contrast to the iterative optimizations of the measurement matrices in the literature, the deep neural networks are trained once using task-specific information optimization and repeatedly implemented for measurement kernel design, enabling efficient adaptive detection of the FHSS signals. Simulations show that the proposed method provides stably low missing detection rates, compared to the compressed detection with random measurement kernels and the recently proposed method. Meanwhile, the measurement design in the proposed method is shown to provide improved efficiency, compared to the commonly used recursive method.


Introduction
Spread spectrum (SS) techniques are widely used in both military and civilian secure communications [1]. In these techniques, the spectra of base-band signals are spread over a much wider band. Thus, SS signals are more resistant to interference or jamming, as common interference or jamming signals can only affect a small fraction of the spread spectrum. Moreover, to capture the entire spread spectrum, conventional non-cooperative receivers must operate at a high sampling rate. This, in turn, makes SS signals resistant to non-cooperative detection or interception. The frequency-hopping spread spectrum (FHSS) is one of the most widely used SS techniques. In FHSS, the carrier frequency of the signal is altered rapidly in a pseudo-random manner, so that the base-band signal is spread over the frequency-hopping range.
The non-cooperative detection of the FHSS signal is the first step of the entire signal interception procedure [2]. Although various methods have been proposed since the 1990s (e.g., methods based on time-frequency analysis [3-8], wavelet analysis [4,9-13], auto-correlation analysis [9,14-16], likelihood analysis [17-21], etc.), energy thresholding is the most commonly used approach in FHSS signal detection [22-24]. To reduce the sampling rate while observing the entire frequency-hopping range, strategies such as the channelized filter bank [23,25] and the sweeping spectrum analyzer [20,26-29] were proposed, which divide the entire spectrum into subbands and observe the individual subbands at relatively lower sampling rates. The signal is declared detected if the energy exceeds the threshold in at least one observed subband. However, as the signal only falls in a relatively narrow band at any given observation time in this scenario, most of the measurements capture only background noise, leading to relatively high false-alarm rates.
In 2006, the theory of compressed sensing (CS) was introduced [30,31]. The CS theory states that a signal can be recovered from sub-Nyquist samples with overwhelming probability if it can be sparsely represented in a transform basis or an overcomplete dictionary. Most of the existing literature on CS-based signal detection exploits sparse representations of the signals, where the dictionaries are assumed to be known [32-35] or are established based on the signal properties. However, some of these methods also require an intermediate step of signal reconstruction [36,37]. Other methods that detect from compressed measurements assume precise knowledge of the signal expressions. More recently, Liu et al. proposed a method to directly detect the FHSS signals from compressed samples [38,39]. Besides the random measurement kernels (i.e., the non-zero coefficients in the measurement matrix) used in most of the CS literature, a strategy to design the measurement matrix prior to the measurements was also proposed, with improved detection performance. However, the measurement kernel design relied on expensive recursive optimizations of Shannon information. Therefore, it was not practical for online adaptive implementations, where a quick response is usually required. Later, information-optimized compressed measurements [40] and information-based pattern recognition of FHSS signals [41] were also studied. In 2020, Wang et al. proposed a partial discrete Fourier transform (DFT)-based method to design the measurement matrix for FHSS signal detection, where the mutual information between the signals and the measurements was also considered [42]. However, the measurement matrix was designed prior to the measurement process. Therefore, the measurements could not be adjusted adaptively according to the posterior knowledge of the signals.
In addition, signal samples or prior knowledge of the communication protocol are also required to train the measurement matrices in that method.
To efficiently extract the key features from signals, fast feature extraction methods have been studied for decades. Although methods such as principal component analysis, linear component analysis, independent component analysis, support vector machines, etc. were studied in various scenarios, the input-output relationships that such methods can represent are limited. In the 1970s, artificial neural networks (ANNs) were proposed to model the relationships between the input data and their features or processed results. The ANN was originally proposed to solve the signal classification problem, with the model optimized through a training procedure. In recent decades, with the development of parallel computing and graphics processing unit (GPU) techniques, deep neural networks (DNNs) have become practical and are now adopted in various areas of signal processing, such as parameter estimation [43], audio and image encoding [44,45], etc. Efforts have also been devoted to the study of robust DNN training [46]. Recently, DNNs were also proposed to estimate the parameters of FHSS signals [47,48]. However, a common problem with these methods is a lack of adaptivity in their implementations.
In this paper, we propose an efficient and adaptive method to detect the FHSS signals non-cooperatively. The adaptivity of this method is achieved with the fusion of the posterior knowledge enhancement algorithm and the DNNs. The posterior knowledge is gained with the gradually increased task-specific information (TSI) [49] in the detection task, while the DNNs are trained to adaptively design the measurement kernels given the updated knowledge of the measured signal. Our proposed method includes several novel contributions: (1) The FHSS signal-detection method proposed in this paper is achieved directly from the low-rate sampling results without the reconstruction of the original signal. (2) The quantitative Shannon information is analyzed based on the posterior information of the channel output and is used in the measurement kernel design for the following measurements, which ensures improvement in the FHSS signal-detection accuracy. (3) More importantly, with an effective combination of the TSI optimization theory and the DNNs in this paper, the inefficiency in the existing information-based method of the compressed measurement matrix design is solved. In particular, in contrast to the iterative optimizations of the measurement matrices in the literature, the DNNs are trained once based on the TSI optimization and are repeatedly implemented to detect the FHSS signals in an efficient manner. Thus, the practical online adaptivity of the FHSS measurement and detection can be achieved with the method proposed in this paper. From the signal processing aspect, the adaptivity in the FHSS signal processing based on the DNNs is also achieved.
The remainder of this paper is organized as follows: In Section 2, the problem formulation, including the signal model and the compressed measurement and detection models, is presented. In Section 3, the principles of the energy detection and the adaptive measurement kernel design based on the TSI optimization are described. Then in Section 4, the proposed adaptive measurement and detection method of the FHSS signals based on the fusion of the posterior information optimization and the DNNs is detailed. In Section 5, we provide the simulation results to verify the proposed method. Finally, in Section 6, the conclusions are drawn. It is worth mentioning that although the analysis and simulations were performed under the assumption that a single FHSS signal is present in the frequency band of interest, the proposed method can also be used for the case of multiple FHSS signals.

Problem Formulation
In this paper, we propose a method to measure and detect FHSS signals compressively, adaptively and efficiently. The proposed method was verified through simulations in which the Gaussian binary frequency-shift keying FHSS signal of the Bluetooth standard [50] was used as a representative FHSS signal. Within each hopping period, the Bluetooth signal $s(t)$ is given by Equation (1), where $T_s$ and $E_s$ represent the symbol period and the energy of the signal in a symbol period, respectively, $N_s$ represents the number of symbol periods in a hopping period, $h$ is the modulation index of the FHSS signal, and $c_r \in \{-1, 1\}$ ($1 \le r \le N_s$) is the $r$th symbol in the hopping period. The function $g(t)$ is the Gaussian filtering term, given by Equation (2), where $Q(\cdot)$ is the Q-function and $\alpha = \frac{\pi}{T_s \sqrt{\log(2)}}$. As specified in the Bluetooth standard, the carrier frequency of the signal randomly hops among 79 channels from 2.402 to 2.480 GHz, with 1 MHz bandwidth for each channel. The symbol period is 1 µs and the frequency-hopping period is 625 µs, i.e., 625 symbol periods.
In this paper, the proposed adaptive measurements and detection of the FHSS signals are done compressively and non-cooperatively using the framework described in Figure 1. In Figure 1, the wireless channel output is first passed through an input band-pass filter to remove the frequency components that are out of the range of interest. The output from the band-pass filter is then multiplied with the measurement kernels, and then integrated using a low-pass filter. The result from the low-pass filter is sampled at a compressed sampling rate, compared to the Nyquist rate respective to the entire FHSS hopping range. The measurement results are collected, and the measurement energy is calculated. Finally, the resulting energy is used to determine if the signal is present. The measurement kernels are designed sequentially and adaptively based on posterior knowledge of the channel output in a row-by-row manner, given the existing measurement data. Therefore, feedback is added from the measurement data to the measurement kernel design module.
The compressed measurement using the framework described in Figure 1 can be modeled as:
$$ y = \Phi x, \tag{3} $$
where the $N \times 1$ vector $x$ represents the output of the input band-pass filter sampled at the Nyquist rate with respect to the entire FHSS hopping range. The $M \times 1$ ($M \ll N$) vector $y$ represents the measurement data. Let us denote the FHSS frequency-hopping range as $B$. Then, using the framework in Figure 1, the $M \times N$ measurement matrix $\Phi$ is a block diagonal matrix with each block being a $1 \times \mathrm{CR}$ vector, where
$$ \mathrm{CR} = \frac{N}{M} \tag{4} $$
is defined as the compression ratio (CR) of the compressed measurement. With the system framework described in Figure 1, each row block in the measurement matrix is defined as the measurement kernel of a single measurement. In this work, we normalize each row of $\Phi$ to unit energy before using it in the measurements. Because the blocks occupy disjoint columns, the rows of $\Phi$ are mutually orthogonal, and with this normalization they are orthonormal. Based on the compressed measurements, the detection decision is made between the following two hypotheses:
$$ H_0: \; y = \Phi n, \qquad H_1: \; y = \Phi (s + n), \tag{5} $$
where $H_0$ and $H_1$ denote the signal absent and signal present hypotheses, respectively. $n$ denotes the channel noise, which is modeled as complex Gaussian white noise with variance $\sigma_n^2$. $s$ denotes the FHSS signal $s(t)$ of Equation (1), sampled at the Nyquist rate with respect to the entire FHSS hopping range.
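As a minimal sketch of the measurement model above, the following Python snippet applies a block-diagonal measurement matrix without ever forming the full matrix: each measurement is the inner product of one unit-norm kernel with CR consecutive Nyquist samples. The helper names `random_unit_kernel` and `compressed_measure` are hypothetical, random kernels stand in for designed ones, and the input is noise only (the signal absent hypothesis).

```python
import math
import random

def random_unit_kernel(cr, rng):
    """Draw a 1 x CR complex Gaussian kernel and scale it to unit l2 norm."""
    kernel = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(cr)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in kernel))
    return [c / norm for c in kernel]

def compressed_measure(x, kernels):
    """Block-diagonal measurement: y[k] = kernels[k] . x[k*CR:(k+1)*CR]."""
    cr = len(kernels[0])
    if len(x) != cr * len(kernels):
        raise ValueError("len(x) must equal CR * M")
    return [sum(c * s for c, s in zip(kern, x[k * cr:(k + 1) * cr]))
            for k, kern in enumerate(kernels)]

rng = random.Random(0)
N, CR = 6400, 10          # values used in the simulations of this paper
M = N // CR               # number of compressed measurements
sigma_n = 1.0             # assumed noise standard deviation for this demo

# Noise-only channel output: complex Gaussian with variance sigma_n^2 per sample.
std = sigma_n / math.sqrt(2)
x = [complex(rng.gauss(0, std), rng.gauss(0, std)) for _ in range(N)]

kernels = [random_unit_kernel(CR, rng) for _ in range(M)]
y = compressed_measure(x, kernels)
energy = sum(abs(v) ** 2 for v in y)   # detection statistic (lambda)
print(M, round(energy, 1))
```

Because every kernel has unit norm and the blocks do not overlap, each entry of `y` keeps the noise variance of a single Nyquist sample, so the energy under noise only concentrates around M times that variance.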
In this paper, the signal detection is conducted based on the energy of the measurement data, and the measurement matrix is designed adaptively based on the gradually obtained measurement data. The principles of the proposed methods are detailed in the following sections.

The Theory of the Adaptive FHSS Signal Compressed Measurement and Detection
According to the noise folding theory [51], as the rows of the measurement matrix for Figure 1 are orthonormal to each other, the noise components in the measurement data are independent and identically distributed (i.i.d.) zero-mean complex Gaussian components with variance $\sigma_n^2$. Then, the probability density function (PDF) of the measurement energy in the signal absent case can be modeled as in Equation (6), where $\lambda$ denotes the energy of the measurement data, and $M!$ represents the factorial of the number of compressed measurements $M$.
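Under the signal absent model just described, the measurement energy is a sum of M independent exponentially distributed terms with mean equal to the noise variance, so its tail probability has a closed form (an Erlang survival function) that can be inverted numerically to obtain an energy threshold for a target false-alarm rate. The sketch below follows from that model rather than from the paper's own expressions; the function names are hypothetical and the bisection search is just one simple way to invert the tail.

```python
import math

def false_alarm_rate(threshold, m, noise_var):
    """P(energy > threshold) when energy is a sum of m i.i.d. Exp(mean=noise_var) terms."""
    z = threshold / noise_var
    if z <= 0:
        return 1.0
    # Erlang survival function: exp(-z) * sum_{j=0}^{m-1} z^j / j!, summed in log space.
    logs = [j * math.log(z) - z - math.lgamma(j + 1) for j in range(m)]
    peak = max(logs)
    return min(1.0, math.exp(peak) * sum(math.exp(v - peak) for v in logs))

def energy_threshold(target_fpr, m, noise_var, tol=1e-9):
    """Bisection for the threshold T with false_alarm_rate(T) close to target_fpr."""
    lo, hi = 0.0, m * noise_var
    while false_alarm_rate(hi, m, noise_var) > target_fpr:
        hi *= 2.0                      # expand until the tail is small enough
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if false_alarm_rate(mid, m, noise_var) > target_fpr:
            lo = mid
        else:
            hi = mid
    return hi

# M = 640 measurements (CR = 10), unit noise variance, target false-alarm rate 0.01.
T = energy_threshold(0.01, 640, 1.0)
print(round(T, 2), false_alarm_rate(T, 640, 1.0))
```

Working in the log domain keeps the sum stable for several hundred measurements, where the individual terms would otherwise overflow or underflow.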
The signal detection is conducted by energy thresholding. More specifically, given a positive threshold $T$, the theoretical false positive rate (FPR, i.e., false-alarm rate) is given by Equation (7), i.e., the probability that the measurement energy $\lambda$ exceeds $T$ in the signal absent case.
The measurement kernels in Figure 1 are designed adaptively and sequentially, one measurement at a time, based on the measurements that have already been obtained and on the TSI optimization. In this paper, we assume that no frequency hop happens during an FHSS detection process. Then, we define the TSI in the signal-detection task as the mutual information between the pre-filtered channel output and the measurement result, conditional on the existing measurement kernels and data. With the kernel of the first measurement randomly initialized, the measurement kernel in the $k$th ($2 \le k \le M$) measurement is designed by solving the following problem:
$$ \hat{\Phi}_k = \arg\max_{\Phi_k} I(x_k; y_k | \Lambda_{k-1}, \Phi_k) \quad \text{s.t.} \quad \|\Phi_k\|_{l_2} = 1, \tag{8} $$
where $\Lambda_{k-1} = \{\Phi_v, y_v\}_{v=1}^{k-1}$ collects the kernels and data of the first $k-1$ measurements, with $\Phi_v$ and $y_v$ representing the measurement kernel and the measurement result during the $v$th measurement, respectively. $\Phi_k$, $x_k$ and $y_k$ represent the measurement kernel, pre-filtered channel output and the measurement result, which are to be designed, observed and obtained at the $k$th measurement, respectively. $\|\cdot\|_{l_2}$ represents the $l_2$ norm.
According to information theory, the conditional mutual information can be decomposed as
$$ I(x_k; y_k | \Lambda_{k-1}, \Phi_k) = h(y_k | \Lambda_{k-1}, \Phi_k) - h(y_k | x_k, \Lambda_{k-1}, \Phi_k), \tag{9} $$
where $h(\cdot|\cdot)$ denotes the conditional entropy. To simplify, we further assume that the measurements made at different times are independent of each other. Then, $h(y_k | x_k, \Lambda_{k-1}, \Phi_k) = h(y_k | x_k, \Phi_k)$ only depends on the variance of the channel noise, and thus is a constant. Therefore, the optimization problem in Equation (8) is equivalent to:
$$ \hat{\Phi}_k = \arg\max_{\Phi_k} h(y_k | \Lambda_{k-1}, \Phi_k) \quad \text{s.t.} \quad \|\Phi_k\|_{l_2} = 1. \tag{10} $$
To solve the optimization problem in Equation (10), we model the pre-filtered channel output during the $k$th compressed measurement, i.e., $x_k$, using the mixture of Gaussian (moG) model, which is commonly used in the literature to solve signal processing problems involving information analysis and optimization [39,52-54]. In this paper, we uniformly divide the entire FHSS frequency-hopping range into $L$ subbands and model the posterior distribution of $x_k$ as:
$$ p(x_k | \Lambda_{k-1}) = P_r(H_0 | \Lambda_{k-1}) f_0(x_k) + P_r(H_1 | \Lambda_{k-1}) \sum_{l=1}^{L} P_r(B_l | H_1, \Lambda_{k-1}) f_l(x_k), \tag{11} $$
where $P_r(H_0 | \Lambda_{k-1})$, $P_r(H_1 | \Lambda_{k-1})$ and $P_r(B_l | H_1, \Lambda_{k-1})$ are the probabilities of the signal absent case, the signal present case, and the probability that the $l$th subband is occupied in the signal present case, respectively, given the measurement kernels and data from the 1st through the $(k-1)$th measurements. The term $f_0(x_k) = CN(0, C_{nn})$ represents the zero-mean complex Gaussian component of the signal absent case, where $C_{nn}$ is the diagonal covariance matrix with diagonal entries $\sigma_n^2$. $f_l(x_k) = CN(0, C_{xx,l})$ represents the Gaussian component where the $l$th subband is occupied in the signal present case. The covariance matrix in this case can be expressed as:
$$ C_{xx,l} = C_{ss,l} + C_{nn}, \tag{12} $$
where $C_{ss,l}$ represents the covariance matrix of the noise-free FHSS signal falling in the $l$th subband, which is modeled as complex Gaussian white noise within the $l$th subband.
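With the signal model in Equation (11), it can be proved that the term for the signal absent case does not affect the result of the measurement optimization problem, and Equation (10) is equivalent to:
$$ \hat{\Phi}_k = \arg\max_{\Phi_k} h(y_k | H_1, \Lambda_{k-1}, \Phi_k) \quad \text{s.t.} \quad \|\Phi_k\|_{l_2} = 1, \tag{13} $$
where the conditional entropy $h(y_k | H_1, \Lambda_{k-1}, \Phi_k)$ is expressed in terms of the subband usage posteriors and the measurement kernel in Equation (14). The posterior probabilities of the subband usages given the signal present hypothesis, i.e., $P_r(B_l | H_1, \Lambda_{k-1})$ ($1 \le l \le L$) in Equation (11), are updated as the adaptive measurements proceed, based on the Bayes rule. With the measurement results modeled as independent of each other, the Bayes update can be expressed as:
$$ P_r(B_l | H_1, \Lambda_{k-1}) = \frac{P_r(B_l | H_1, \Lambda_{k-2}) \, p_r(y_{k-1} | B_l, \Phi_{k-1})}{\sum_{j=1}^{L} P_r(B_j | H_1, \Lambda_{k-2}) \, p_r(y_{k-1} | B_j, \Phi_{k-1})}, \tag{15} $$
where $P_r(B_l | H_1, \Lambda_0) = P_r(B_l | H_1)$ represents the prior probability that the $l$th subband is used in the signal present case. The likelihood function $p_r(y_{k-1} | B_l, \Phi_{k-1})$ is given by:
$$ p_r(y_{k-1} | B_l, \Phi_{k-1}) = \frac{1}{\pi \, \Phi_{k-1} C_{xx,l} \Phi_{k-1}^H} \exp\left( -\frac{|y_{k-1}|^2}{\Phi_{k-1} C_{xx,l} \Phi_{k-1}^H} \right), \tag{16} $$
where $(\cdot)^H$ represents the Hermitian operation.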
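The Bayes update of the subband usage probabilities can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the function name is hypothetical, and the per-subband projected variances (the scalar obtained by projecting each subband covariance matrix onto the previous kernel) are passed in as precomputed numbers instead of being derived from a covariance model.

```python
import math

def update_subband_posteriors(prior, y_prev, projected_var):
    """One Bayes step: posterior[l] is proportional to prior[l] * p(y_prev | B_l, kernel).

    projected_var[l] is the variance of the previous measurement if subband l is
    occupied; the likelihood is a scalar zero-mean complex Gaussian density.
    """
    log_w = []
    for p, v in zip(prior, projected_var):
        if p <= 0.0:
            log_w.append(float("-inf"))          # a ruled-out subband stays ruled out
        else:
            log_w.append(math.log(p) - abs(y_prev) ** 2 / v - math.log(math.pi * v))
    peak = max(log_w)
    weights = [math.exp(w - peak) for w in log_w]  # log domain avoids underflow
    total = sum(weights)
    return [w / total for w in weights]

L = 20                                  # number of subbands used in this paper
prior = [1.0 / L] * L                   # no prior knowledge of the subband usage
projected_var = [1.0] * L               # noise-only variance for most subbands
projected_var[7] = 5.0                  # hypothetical: the kernel passes subband 7
posterior = update_subband_posteriors(prior, 2.0 + 1.5j, projected_var)
print(round(posterior[7], 3))
```

A strong measurement is better explained by the subband whose projected variance is large, so the posterior mass moves towards that subband; repeating the step with each new measurement gives the gradual knowledge enhancement described in the text.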
In the literature, Shannon information-based optimization problems are usually solved with recursive gradient methods [41,50,51]. For the optimization problem in Equation (13), an update step for the measurement kernel in the recursive optimization process can be expressed as in Equation (17), where $\Phi_k^{(u)}$ and $\Phi_k^{(u+1)}$ represent the resulting measurement kernel at the $k$th row of the measurement matrix at the $u$th and $(u+1)$th iterations, respectively, and $\mu$ is the optimization step size. According to Equation (14), the gradient is given by Equation (18). For interested readers, the derivations of Equations (13), (14) and (18) are provided in Appendix A.

Knowledge-Enhanced Compressed Detection of Frequency-Hopping Spread Spectrum Signals with Deep Neural Networks
Theoretically, signal detection with a measurement matrix obtained from the recursive optimization in Equation (17) can achieve improved detection accuracy, compared to compressed detection with random measurement kernels. However, according to simulations, the optimization process usually needs more than 20,000 iterations to converge and yield significantly improved detection performance, which makes it time-consuming. Therefore, it is not feasible for online adaptive measurement kernel design and signal detection. To improve the efficiency of the method, we propose a DNN-based method to conduct the adaptive measurements and detection of the FHSS signals. In contrast to the recursive method described in Equation (17), the neural networks in the proposed method are trained once and then used repeatedly for adaptive measurement kernel design.

The Structure and the Training of the Deep Neural Networks
The structure of the DNNs in the proposed method is shown in Figure 2. As described in Figure 2, the architecture of the proposed DNNs is fully connected. The nodes in the input layer represent the posterior probabilities of the subband usage in the signal present case, given the designed coefficients in the measurement matrix and the measured results, i.e., $P_r(B_l | H_1, \Lambda_{k-1})$ ($1 \le l \le L$) in Equation (15). The width of the input layer is equal to the number of subbands in the moG model of the FHSS signals. The output layer of the DNN contains the CR designed coefficients of the measurement kernel of a measurement. Considering that the signals and the coefficients of the measurement kernels are complex, the width of the output layer is 2CR, where CR nodes represent the real parts and the others represent the imaginary parts. If the neural network is too deep, the training process becomes difficult to converge, while if it is too shallow, the resulting measurement matrix may not be effective enough. From simulation trials on various DNN structures, we find that 8-layer deep neural networks are efficient to train and, meanwhile, effectively improve the accuracy of the adaptive FHSS signal detection. For the six hidden layers in the proposed DNN, the width of the 1st through the 5th hidden layers is 512 and the width of the 6th hidden layer is 1024. With the nodes in the input layer denoted by the row vector $q_0$, the nodes in the $m$th ($1 \le m \le 6$) hidden layer are then calculated by:
$$ q_m = \tanh(q_{m-1} W_m), \tag{19} $$
where $q_m$ and $q_{m-1}$ ($m \ge 1$) are row vectors representing the nodes in the $m$th and $(m-1)$th layers of the DNN, respectively. $W_m$ is the matrix of weights in the matrix multiplication used to obtain the nodes of the $m$th layer.
$\tanh(\cdot)$ is the entry-wise hyperbolic tangent activation function, which can be expressed as:
$$ \tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}. \tag{20} $$
In our study, we found that deterministic input-output relationships in the feedback route could lead the adaptive measurement and decision procedure of the entire system to converge wrongly if there is even a slight non-ideality in the training of the DNN. To solve this problem, we add a dropout layer after the 6th hidden layer, with a dropping rate of 0.95. The dropout layer is active both in the training process of the DNN and in the adaptive FHSS measurement processing afterwards. Finally, a fully connected operation is added to generate the output layer. Therefore, we have:
$$ q_{out} = \mathrm{Dropout}(q_6, 0.95) \, W_{out}, \tag{21} $$
where $\mathrm{Dropout}(\cdot, 0.95)$ represents the dropout operation that replaces 95% of the entries in the vector with zeros. $W_{out}$ is the matrix of weights in the matrix multiplication used to obtain the nodes of the output layer. $q_{out}$ represents the nodes in the output layer.
In this work, the training of the DNNs was conducted using the gradient-based backpropagation method. The posterior subband usage probabilities from simulations of the adaptive measurement and detection process using the conventional recursive optimization method were collected and used as the training data. The negative of the conditional differential entropy in Equation (13), $-h(y_k | H_1, \Lambda_{k-1}, \Phi_k)$, was used as the training penalty, which could be approximated as a function of the data at the input layer and the result at the output layer, i.e., $P_r(B_l | H_1, \Lambda_{k-1})$ and $\Phi_k$, according to Equation (14). The number of subbands in Equation (14) was taken to be $L = 20$.
The DNN training was conducted using the TensorFlow 2.0 GPU version [55] based on Python 3.7 on a computer with the NVIDIA Quadro P2000 GPU. Each of the proposed DNNs was trained using 20,000 training samples with a batch size of 100 and 400 training epochs in total. To ensure that the noise components in the measurement results were i.i.d. Gaussian components, the designed coefficients from the DNNs were further normalized to unit $l_2$ norm before being used in the measurement.
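To make the layer sizes and the always-on dropout concrete, the forward pass described above can be sketched in plain Python. This is only a shape-level illustration under several assumptions: the weights are random placeholders rather than trained values, no bias terms are used (none are mentioned in the text), the first CR outputs are taken as real parts and the last CR as imaginary parts, and all function names are hypothetical. The paper's own implementation uses TensorFlow.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Placeholder weights, stored as n_out columns of length n_in."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]

def dense(vec, columns):
    """Row vector times weight matrix (one dot product per output node)."""
    return [sum(v * w for v, w in zip(vec, col)) for col in columns]

def design_kernel(posteriors, hidden, w_out, rng, drop_rate=0.95):
    """Map subband posteriors to a unit-norm complex measurement kernel."""
    q = posteriors
    for columns in hidden:                      # six tanh hidden layers
        q = [math.tanh(a) for a in dense(q, columns)]
    # Dropout stays active when the network is used, not only during training.
    q = [a if rng.random() >= drop_rate else 0.0 for a in q]
    out = dense(q, w_out)                       # 2*CR real-valued outputs
    cr = len(out) // 2
    kernel = [complex(re, im) for re, im in zip(out[:cr], out[cr:])]
    norm = math.sqrt(sum(abs(c) ** 2 for c in kernel))
    return [c / norm for c in kernel]           # unit l2 norm, as in the text

rng = random.Random(1)
L, CR = 20, 20
widths = [L, 512, 512, 512, 512, 512, 1024]     # input layer plus six hidden layers
hidden = [init_layer(a, b, rng) for a, b in zip(widths, widths[1:])]
w_out = init_layer(widths[-1], 2 * CR, rng)
kernel = design_kernel([1.0 / L] * L, hidden, w_out, rng)
print(len(kernel), round(sum(abs(c) ** 2 for c in kernel), 6))
```

Frameworks such as TensorFlow rescale the surviving activations after dropout; that scaling is omitted here because the final normalization to unit norm removes any common scale factor.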

Combination of Knowledge-Enhanced Compressed Detection Architecture and the Deep Neural Networks
With the trained DNNs, the proposed procedure of the adaptive compressed measurement and detection of the FHSS signals is described in Figure 3. In the adaptive procedure described in Figure 3, the coefficients of the measurement kernel in the first measurement are initialized as i.i.d. complex Gaussian random variables, and then normalized to unit $l_2$ norm. The prior probabilities of the subband usage in the signal present case, i.e., $P_r(B_l | H_1)$ ($1 \le l \le L$), are taken to be equal to each other, as no prior knowledge of the subband usage is assumed.
The first measurement result is obtained using the initialized measurement kernel. Then the posterior probabilities of the subband usages are calculated according to Equation (15), and passed to the trained neural network to design the measurement kernel for the next measurement. In this manner, the measurements, the subband usage posterior probability updates and the measurement kernel design steps with the DNN are done sequentially and iteratively until the entire measurement procedure is finished. Finally, the energy of the measurement data, i.e., the resulting λ in Figure 3, is used to make the detection decision. As described in Section 3, if measurement energy is smaller than the decision threshold T, the signal absent decision is made; otherwise, the signal present decision is made.

Results
In this section, we provide Monte-Carlo simulation results to evaluate the performance of the proposed method. We first studied the detection accuracy of the proposed adaptive compressed detection method. For comparison, the non-compressed detection and the conventional compressed detection methods using the system in Figure 1 were simulated. In the conventional compressed detection method, the measurement kernels were drawn as i.i.d. complex Gaussian random variables, and then each row of the measurement matrix was normalized to unit energy. In addition, the FHSS signal compressed detection method proposed in 2020 [42], which is based on the partial DFT and maximum energy thresholding of the compressed samples, with each measurement kernel taking up an entire row of the measurement matrix, was also included in the comparison. To obtain fair comparisons of the signal-detection performance of the different methods, we set the energy thresholds $T$ for the proposed method, the non-compressed method and the conventional compressed method according to Equations (6) and (7), taking the theoretical FPR as 0.01 (any other FPR value between 0 and 1 could be used). The threshold for the partial DFT-based method at each CR and each SNR was determined by taking the simulated FPR as 0.01 from 10,000 simulations.
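When no closed-form tail probability is available, as for the maximum-energy statistic of the partial DFT-based method, a threshold can be read off noise-only simulations as an empirical quantile. The sketch below shows one way to do this; the function name is hypothetical and the toy statistic (the largest of 32 unit-mean exponential energies per trial) is only a stand-in for the actual detector output.

```python
import random

def empirical_threshold(noise_only_stats, target_fpr):
    """Observed value T such that at most target_fpr of the samples exceed T."""
    if not 0.0 <= target_fpr < 1.0:
        raise ValueError("target_fpr must be in [0, 1)")
    ordered = sorted(noise_only_stats)
    allowed = int(target_fpr * len(ordered))    # samples allowed above the threshold
    return ordered[len(ordered) - allowed - 1]

rng = random.Random(42)
n_trials, n_meas = 10_000, 32                   # toy sizes, chosen for a fast demo
# Toy noise-only statistic: maximum measurement energy within each trial.
stats = [max(rng.expovariate(1.0) for _ in range(n_meas)) for _ in range(n_trials)]

T = empirical_threshold(stats, 0.01)
simulated_fpr = sum(s > T for s in stats) / n_trials
print(round(T, 2), simulated_fpr)
```

With 10,000 trials and a target rate of 0.01, only about 100 samples lie above the threshold, so the estimate carries noticeable sampling error; more trials tighten it.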
The simulated curves of missing detection rates versus the signal-to-noise ratio (SNR) are shown in Figures 4 and 5, where the CRs were taken to be 10 and 20, respectively. The DNNs used to design the measurement kernels were trained using the TensorFlow 2.0 GPU version based on Python 3.7 on a computer with the NVIDIA Quadro P2000 GPU. Training the DNN for the adaptive measurement kernel design took 5.36 days at CR = 10 and 6.61 days at CR = 20. The SNR is defined as the ratio between the signal power and the noise power. The number of Nyquist samples during the measurement procedure was N = 6400, resulting in M = 640 and M = 320 for the two CR cases in the compressed measurement methods. To generate the curves in Figures 4 and 5, the SNR was varied from −30 dB to 20 dB. Each point in the curves was generated using 100,000 simulations. The FHSS carrier frequency in each simulation was randomly selected from the 79 channels with equal probabilities. From the two figures, we observe that at any CR and a fixed false-alarm rate (i.e., false positive rate), the missing rates (i.e., false negative rates) of all four methods in the comparison decrease with increasing SNR. Non-compressed detection achieves the lowest missing rate in most cases. The partial DFT-based compressed detection method can obtain good detection performance at medium SNR values. However, at higher SNRs, its missing detection rates can be significantly higher than even those of the conventional compressed detection with random measurement kernels. The adaptive compressed detection method, although it has higher missing rates than the non-compressed detection method, and higher missing rates than the partial DFT-based compressed detection method at some medium SNR values, outperforms the compressed method with random measurement kernels at low and medium SNR values at any CR.
This improvement in missing rate can be around an order of magnitude at some SNRs. For high SNR values (above −6.5 dB at CR = 10 and above −5 dB at CR = 20), the missing detection rates of the proposed method fall below 0.001 and are close to those of the conventional compressed detection method with random measurement kernels. To gain deeper insight into the adaptive measurement procedure of the proposed method, we further analyzed the power spectra of the designed measurement kernels. A larger value of the measurement kernel power spectrum within the true subband that the FHSS signal falls in indicates a higher resulting SNR in the measurement data. In turn, higher detection accuracy would be expected. As a representative example, Figure 6 shows the averaged power spectrum value of the adaptive measurement kernels on the true FHSS subbands versus the measurement index at CR = 20 and SNR = −10 dB over 100,000 adaptive compressed detection simulations. In each of the simulations, the carrier frequency was randomly selected from the 79 channels. For comparison, the averaged power spectrum value of the measurement kernels on the true subbands in the partial DFT-based method and that of the conventional random measurement kernels versus the measurement index at CR = 20 over 100,000 compressed detection simulations are also plotted in Figure 6. In Figure 6, we observe that the averaged power spectrum value of the adaptive measurement kernels within the true subbands that the FHSS signals fall in increases gradually as more measurements are made, which reflects a gradual reduction of the subband usage uncertainty. In contrast, there is no increasing trend in the averaged power spectrum value for the random measurement kernels or for the measurement kernels of the partial DFT-based method during the measurement process, as those measurement kernels are designed prior to the measurement process.
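The kernel power spectrum analysis above can be illustrated with a direct DFT of a single measurement kernel. The sketch below is an assumption-laden illustration: it treats each DFT bin of a length-CR kernel as one subband, which matches L = 20 subbands only at CR = 20, and the sign convention linking a bin to a physical subband is not specified in the text. The function name is hypothetical.

```python
import cmath
import math

def kernel_power_spectrum(kernel):
    """Squared DFT magnitudes, scaled so that the bins sum to the kernel energy."""
    n = len(kernel)
    spectrum = []
    for b in range(n):
        acc = sum(c * cmath.exp(-2j * math.pi * b * t / n) for t, c in enumerate(kernel))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

CR = 20
tone_bin = 7    # hypothetical index of the occupied subband

# A unit-norm complex sinusoid concentrates all of its energy in one bin.
tone = [cmath.exp(2j * math.pi * tone_bin * t / CR) / math.sqrt(CR) for t in range(CR)]
# A unit-norm impulse spreads its energy evenly over all bins.
impulse = [1.0 if t == 0 else 0.0 for t in range(CR)]

ps_tone = kernel_power_spectrum(tone)
ps_impulse = kernel_power_spectrum(impulse)
print(round(ps_tone[tone_bin], 6), round(ps_impulse[tone_bin], 6))
```

A kernel whose spectrum is concentrated in the occupied subband behaves like the sinusoid here, while a kernel with no knowledge of the subband behaves more like the impulse, with only a 1/CR share of its energy in any given bin.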
In addition to the simulations on the detection accuracy and the measurement kernel components above, we also conducted simulations to compare the time cost of the proposed adaptive detection method using the DNNs with that of the recursive measurement kernel optimization method discussed in Section 3. The time cost of the partial DFT-based compressed detection method in [42] was also compared. The cases of CR = 10 and CR = 20 were both studied, and 100 FHSS detection simulations at SNR = −10 dB were run for each method at each CR. To achieve the detection accuracy of the proposed method at these CR values, the recursive method usually needs 20,000 iterations to design the measurement kernel of a single measurement, and this number of iterations was used in the simulations. As in the theoretical analysis and simulations discussed above, 6400 Nyquist samples with respect to the entire FHSS hopping range were used in each simulation to decide whether the signal is present or absent, resulting in 640 and 320 measurements in each detection simulation for CR = 10 and CR = 20, respectively. A computer with an Intel Xeon E3-1225 v5 CPU @ 3.30 GHz and 16.0 GB of RAM was used to run these simulations. The statistics of the time taken by each of these 100 detection simulations are shown in Table 1.
From Table 1, we observe that, as the measurement matrix for the partial DFT-based method is designed non-adaptively prior to the measurement process, the partial DFT-based method costs the least time in the measurement and detection process. More importantly, from the timing results of the two adaptive methods, we observe that the efficiency of the proposed adaptive method is significantly improved, compared to the method with recursively optimized measurement kernels. In the case of CR = 10, the proposed method is more than 320 times faster on average, while in the case of CR = 20, it is more than 380 times faster on average. Although the time costs of the proposed method as stated in Table 1 still seem relatively long for practical FHSS detection, the implementation efficiency can be expected to improve considerably with hardware and software modules specifically designed to implement the DNN and the signal detection in this paper. The design of such modules will be studied in our future work.

Conclusions
In this paper, a knowledge-enhanced compressed measurement method was proposed for adaptive and non-cooperative detection of FHSS signals using DNNs. In contrast to the conventional non-compressed receiver, which is infeasible for wide frequency-hopping bandwidths, the proposed method conducts the FHSS signal detection at compressed sampling rates. The measurement kernels are designed adaptively based on the continuously updated knowledge from the compressed measurements. Moreover, in contrast to the iterative measurement kernel optimizations, the DNNs are trained once off-line based on the TSI optimization and applied repeatedly online to adaptively design the measurement kernels, enabling efficient FHSS signal detection. Simulation results showed that the proposed adaptive compressed detection method achieved stably low missing detection rates, compared to the compressed detection system with random measurement kernels and the recently proposed partial DFT-based method. In addition, the simulations showed that the efficiency of the proposed adaptive FHSS detection method with the DNNs is significantly higher than that of the recursive measurement kernel optimization method. Thus, the measurement kernel design procedure becomes much more practical for online adaptive measurement and detection of FHSS signals.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
In the measurement kernel design stage, the pre-filtered signal from the wireless channel is modeled with an moG distribution as described in Equation (11). As a compressed measurement can be treated as a weighted sum of neighboring Nyquist samples, the distribution of the measured result $y_k$ ($1 \le k \le M$), given the measurement kernel $\Phi_k$ and the measurement kernels and data in the 1st through the $(k-1)$th measurements, is also an moG distribution:
$$ p(y_k | \Lambda_{k-1}, \Phi_k) = P_r(H_0 | \Lambda_{k-1}) \, g_0(y_k | \Phi_k) + P_r(H_1 | \Lambda_{k-1}) \sum_{l=1}^{L} P_r(B_l | H_1, \Lambda_{k-1}) \, g_l(y_k | \Phi_k), $$
where $P_r(H_0 | \Lambda_{k-1})$, $P_r(H_1 | \Lambda_{k-1})$ and $P_r(B_l | H_1, \Lambda_{k-1})$ represent the posterior probabilities of the signal absent case, the signal present case and the probability that the $l$th subband is occupied in the signal present case, respectively, given the measurement kernels and data during the 1st through the $(k-1)$th measurements (i.e., $\Lambda_{k-1}$). $g_0(y_k | \Phi_k)$ and $g_l(y_k | \Phi_k)$ ($1 \le l \le L$) denote the Gaussian component in the signal absent case and the Gaussian component where the $l$th subband is occupied in the signal present case, respectively. Let $CN(\cdot, \cdot)$ stand for the PDF of the complex Gaussian distribution, with the first and second parameters representing the mean and the covariance matrix (or variance) of the distribution, respectively. As the rows of the measurement matrix are normalized to unit energy, we have:
$$ g_0(y_k | \Phi_k) = CN(0, \Phi_k C_{nn} \Phi_k^H) = CN(0, \sigma_n^2) $$
and
$$ g_l(y_k | \Phi_k) = CN(0, \Phi_k C_{xx}^{(l)} \Phi_k^H), $$
where $C_{nn}$ is the diagonal matrix with its diagonal entries equal to the channel noise variance $\sigma_n^2$, and $C_{xx}^{(l)} = C_{ss}^{(l)} + C_{nn}$, with $C_{ss}^{(l)}$ representing the covariance matrix of the noise-free FHSS signal that falls into the $l$th subband in the signal present case.