A Local Search Maximum Likelihood Parameter Estimator for Chirp Signals

Abstract: A local search Maximum Likelihood (ML) parameter estimator for mono-component chirp signals in low Signal-to-Noise Ratio (SNR) conditions is proposed in this paper. The approach combines a deep learning denoising method with a two-step parameter estimator. The denoiser uses a residual-learning-assisted Denoising Convolutional Neural Network (DnCNN) to recover the structured signal component, which is used to denoise the original observations. Following the denoising step, a coarse parameter estimator based on the Time-Frequency (TF) distribution is applied to the denoised signal to obtain approximate parameter estimates. Then, around the coarse results, a local search using the ML technique yields the fine estimates. Numerical results show that the proposed approach outperforms several existing methods in terms of parameter estimation accuracy and efficiency.


Introduction
Chirp signals have a broad range of applications in radar, sonar, communications, and medical imaging [1,2]. Estimating their parameters, i.e., the initial frequency f0 and the chirp rate k, is a significant task in digital signal processing [3,4]. The main trends in research on chirp signal parameter estimation are improving estimation accuracy, increasing computational efficiency, and enhancing adaptability to low SNRs [5].
At present, TF analysis is an efficient tool for analyzing the behavior of nonstationary signals [3]. A TF distribution shows which frequencies are present at a given time instant. TF transforms, e.g., the Short-Time Fourier Transform (STFT) [6] and the Wigner-Ville Distribution (WVD) [7,8], convert signals from the one-dimensional time domain to the two-dimensional TF domain. The Instantaneous Frequency (IF) of a chirp signal varies linearly with time. Therefore, line detection methods, e.g., the Radon transform [9,10] and the Hough transform [11-13], which integrate along all potential lines in the TF domain, convert the line detection task into locating the maximum peak in the parameter domain after integration. Direct use of TF-based methods has proved effective and practical for detecting and estimating chirp signals. In most cases, however, the sampled signals are blurred by noise, so chirp information can hardly be obtained from the TF domain alone, and methods based on the Radon-Wigner Transform (RWT) [8,10] or the Wigner-Hough Transform (WHT) [11] are computationally heavy.
The Fractional Fourier Transform (FrFT) has also been shown to correspond to a representation of the signal on an orthonormal basis formed by chirps, which are essentially shifted versions of one another [14]. Consequently, the FrFT has a good energy concentration property for a chirp signal at the proper transform order. After the FrFT of a chirp signal, the parameters can be estimated by searching for the peak position in the FrFT domain. Although the FrFT performs very well in chirp signal detection, its parameter estimation accuracy is strongly influenced by the signal length, the sample rate, and the search step size.
It has been proved that the ML estimator [21,22] achieves the best performance for finite data samples and is asymptotically optimal in the sense that it attains the Cramér-Rao Lower Bound (CRLB) [23,24]. However, directly applying the ML estimator to chirp parameter estimation requires a two-dimensional grid search of the compressed likelihood function, which becomes computationally prohibitive at high estimation accuracy.
Another approach to chirp parameter estimation is based on deep learning. Data-driven and deep learning-based methods in digital signal processing have attracted significant interest in recent years. Xiaolong Chen applies a Convolutional Neural Network (CNN) to replace the Fourier transform and the FrFT for single-frequency and LFM signal detection and estimation [25], showing that the CNN-based method can achieve good recognition performance in low SNR conditions. Hanning Su addresses the problem of estimating the parameters of constant-amplitude chirp signals with single or multiple components embedded in noise using a Complex-Valued Deep Neural Network (CV-DNN) [5]. Simulation results indicate that the CV-DNN outperforms conventional processors; it is more accurate and faster than the WHT, enabling real-time signal processing with fewer computational resources. Furthermore, they demonstrate that the CV-DNN is robust to changes in the modulation parameters and the number of components of a chirp signal. Beyond chirp signals, Yuan Jiang proposes a deep learning denoising-based approach for line spectral estimation [26], using a CNN to preprocess sinusoidal signals in the time domain, which offers a substantial improvement in line spectral estimation. As for denoising, Se Rim Park proposes a CNN method to remove noise from speech signals to enhance their quality and intelligibility [27]; the training and validation data are the STFT amplitude spectra of the noisy and clean audio signals, respectively, and the outputs are transformed back to the time domain by the inverse STFT using the amplitude spectra of the CNN output and the phase spectrum of the noisy signal. Deep learning also plays an active role in signal detection and classification.
Huyong Jin proposes a CNN-based framework to perform preamble detection for underwater acoustic communications [28], which learns features from the TF spectrum and provides an efficient solution for preamble detection in complicated underwater acoustic channels. For signal classification, Johan Brynolfsson uses the WVD instead of the spectrogram as the input to a CNN to classify one-dimensional non-stationary signals and achieves good performance [29].
Inspired by the above ideas, in this paper we propose a local search maximum likelihood parameter estimator based on a deep learning denoising approach for chirp parameter estimation in low SNR conditions. As shown in Figure 1, the proposed approach consists of a DnCNN, which is trained to denoise the noisy chirp observations, and a local search estimator applied to the denoised signals and the observations. We explain the composition of the DnCNN and the details of the local search estimator. Numerical results show that the proposed approach yields a substantial improvement in estimation accuracy and efficiency over the RWT and the FrFT. The benefit is attributed to the DnCNN denoiser, which reshapes the observed signal and greatly improves the SNR.

Deep Learning-Based Denoiser
Driven by the easy access to the large-scale dataset and the advances in deep learning methods, there have been several attempts to handle the denoising problem by deep neural networks [30][31][32][33]. In this section, we present the design of the DnCNN used for denoising and the associated training process.
The overall framework of DnCNN is shown in Figures 2 and 3. We modify the ResNet [34,35] to make it suitable for denoising the reshaped signal image array, as shown in Figure 3. Unlike a residual network that uses many residual units (i.e., identity shortcuts), our DnCNN employs three single residual units to predict the latent clean signal. Such a residual learning strategy makes the network easy to train and helps ease optimization [35,36]. Generally, DnCNN has a layer structure that includes an input layer, multiple hidden layers, and a regression output layer. The input of DnCNN is a real mono-component chirp signal blurred by additive white Gaussian noise. We cut the observation sequence y(n) into equal-length segments at regular intervals and splice them from left to right in chronological order to form a two-dimensional image matrix, denoted as Y0 ∈ R^(m×n). The regression output layer computes the half-mean-squared-error loss. The optimization goal of training is to minimize the half-mean-squared-error loss between the regression output ŷ_i(n) and the clean labels x_i(n); the loss function for training is shown in Equation (1).

loss = (1/(2M)) Σ_{i=1}^{M} ||ŷ_i(n) − x_i(n)||²    (1)

where M is the mini-batch size, ŷ_i(n), x_i(n) ∈ R^(N×1), and N is the length of the observation.
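For concreteness, the reshaping of the observation into the image matrix Y0 and the half-mean-squared-error loss of Equation (1) can be sketched in Python. This is a minimal sketch: the segment length and hop below are illustrative, since the exact segmentation used to form the 36 × 57 array from the 513-sample signal is not fully specified here.

```python
import numpy as np

def signal_to_image(y, seg_len, hop):
    """Cut y(n) into equal-length segments at regular intervals (hop) and
    splice them from left to right, in chronological order, as the columns
    of a two-dimensional image matrix Y0."""
    starts = range(0, len(y) - seg_len + 1, hop)
    return np.stack([y[s:s + seg_len] for s in starts], axis=1)

def half_mse_loss(y_out, x_label):
    """Half-mean-squared-error loss over a mini-batch of M signals,
    as in Equation (1)."""
    M = y_out.shape[0]
    return np.sum((y_out - x_label) ** 2) / (2.0 * M)
```

With non-overlapping segments (hop equal to the segment length), `signal_to_image` simply folds the time series column by column; an overlapping hop produces a wider image from the same observation.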
During the training process, we use mini-batch gradient descent with the Adam optimization algorithm [37] to evaluate the gradient of the loss function and backpropagate to update the DnCNN weights. To effectively remove the noise from the observations, we need to train the DnCNN with a diverse set of training signals whose parameters and SNR conditions cover the signals of interest. One important factor in generating the training data is the SNR: as the SNR of the test signal is unknown, the DnCNN has to be trained over a range of SNR values.

Local Search Maximum Likelihood Parameter Estimator
After denoising, parameter estimation is performed by the local search estimator using the denoised signal ŷ(n) and the observation y(n). As shown in Figure 4, the proposed approach consists of a TF coarse estimator, which uses the TF distribution of ŷ(n) to coarsely estimate the initial frequency f̃0 and chirp rate k̃, followed by an ML fine estimator applied to f̃0, k̃, and y(n) to obtain the fine estimates f̂0 of the initial frequency and k̂ of the chirp rate.

TF Coarse Estimator
A simple and effective parameter estimation method for the chirp signal is to estimate the IF. The IF represents the instantaneous frequency at each time instant and combines the initial frequency f0 and the chirp rate k in a clear linear relationship. The TF coarse estimator directly uses the IF information of ŷ(n), so its performance is strongly influenced by the time-frequency localization and the energy concentration around the IF. It is well known that the classical WVD provides a high-resolution TF representation of a mono-component chirp signal [38]; here we employ the WVD for coarse estimation. One important factor in the TF coarse estimator is the SNR: since we directly use the IF, the result is acceptable only when the SNR is high enough. We analyze the SNR improvement from y(n) to ŷ(n) later.
For a discrete signal ŷ(n) with N samples, the WVD is defined as in Equation (2).
For an ideal chirp signal, the WVD is an impulsive spectral line along its IF; for a finite-length chirp signal, the WVD has a dorsal-fin shape, as shown in Figure 5. We can see that the spectrogram of the chirp signal attains its maximum at the frequency f0 + k·t_n at each discrete time t_n. Therefore, the IF of ŷ(n) can be estimated by maximizing the spectrogram at each discrete time, as in Equation (3). Considering that there are 2L+1 discrete IF points, the initial frequency f̃0 and chirp rate k̃ can then be obtained by statistical averaging, as in Equations (4) and (5).
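The coarse estimator can be sketched as follows. This is a sketch under stated assumptions, not the authors' exact implementation: it uses one common discretization of the WVD, assumes an analytic (complex) chirp so that cross-terms from negative frequencies do not appear, and replaces the statistical averaging of Equations (4) and (5) with an equivalent least-squares line fit to the per-time IF peaks of Equation (3).

```python
import numpy as np

def wvd(y):
    """Discrete Wigner-Ville distribution of an analytic signal y[n]:
    W[n, q] = sum_m y[n+m] * conj(y[n-m]) * exp(-j*2*pi*q*m/N),
    so FFT bin q maps to frequency q * fs / (2 * N)."""
    N = len(y)
    W = np.empty((N, N))
    for n in range(N):
        m_max = min(n, N - 1 - n)        # symmetric lags that stay in-bounds
        m = np.arange(-m_max, m_max + 1)
        r = np.zeros(N, dtype=complex)
        r[m % N] = y[n + m] * np.conj(y[n - m])
        W[n] = np.fft.fft(r).real
    return W

def tf_coarse_estimate(y, fs):
    """Pick the peak frequency of the WVD at each time (Equation (3)) and
    fit a line f = f0 + k*t to obtain the coarse estimates of f0 and k."""
    N = len(y)
    t = (np.arange(N) - N // 2) / fs     # midpoint of the signal is t = 0
    W = wvd(y)
    f_if = np.argmax(W, axis=1) * fs / (2 * N)
    mid = slice(N // 4, 3 * N // 4)      # drop edge instants with short lags
    k_c, f0_c = np.polyfit(t[mid], f_if[mid], 1)
    return f0_c, k_c
```

Restricting the line fit to the central time instants avoids the signal edges, where only a few symmetric lags are available and the IF peak is unreliable.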

ML Fine Estimator
After the TF coarse estimation, the ML fine estimation is performed; the coarse result (f̃0, k̃) is used to narrow the search scope of the compressed likelihood function of the observation y(n). It is worth noting that (f̃0, k̃) is estimated from the DnCNN output ŷ(n), whereas the final result of the ML fine estimator is based on y(n); we analyze the estimation accuracy obtained with ŷ(n) and y(n) later.
Consider an observation y(n) corrupted by additive white Gaussian noise, as shown in Equation (6),
where the chirp rate k and the initial frequency f0 are unknown parameters. The data model expressed in Equation (6) can be written in matrix form as Equation (7).
Under the ML principle, the ML estimates of the parameters are obtained by maximizing the compressed likelihood function, described in Equation (9),
where H denotes the conjugate transpose. It can be observed that obtaining (f̂0, k̂) requires a two-dimensional grid search over the two parameter vectors, which incurs a high computational load. To avoid this, we resort to a local search by limiting the parameter vectors to a small area around the result of the TF coarse estimator, described as f ∈ [f̃0 − σ1, f̃0 + σ1] and k ∈ [k̃ − σ2, k̃ + σ2], where σ1 and σ2 are the limitation widths. The accuracy of the TF coarse estimator and the choice of σ1, σ2 directly affect the performance of the proposed estimator; we analyze this later.
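The local search over the compressed likelihood can be sketched as follows. This is a minimal sketch assuming the complex (analytic) chirp model with unknown complex amplitude, in which case maximizing the compressed likelihood of Equation (9) reduces to maximizing |a(f, k)^H y|² over the local grid; the default widths and step match the values used in the simulations below.

```python
import numpy as np

def ml_fine_estimate(y, fs, f0_c, k_c, sigma1=0.15, sigma2=0.35, step=0.02):
    """Local grid search of the compressed likelihood around the TF coarse
    estimates (f0_c, k_c); a(f, k) is the unit-amplitude chirp steering
    vector and H denotes the conjugate transpose."""
    N = len(y)
    t = (np.arange(N) - N // 2) / fs                 # midpoint is t = 0
    f_grid = np.arange(f0_c - sigma1, f0_c + sigma1 + step / 2, step)
    k_grid = np.arange(k_c - sigma2, k_c + sigma2 + step / 2, step)
    best_val, best = -np.inf, (f0_c, k_c)
    for f in f_grid:
        for k in k_grid:
            a = np.exp(2j * np.pi * (f * t + 0.5 * k * t ** 2))
            val = abs(np.vdot(a, y)) ** 2            # |a^H y|^2, Equation (9)
            if val > best_val:
                best_val, best = val, (f, k)
    return best
```

Because the search is confined to a (2σ1/step) × (2σ2/step) grid around the coarse result, the cost is a small constant number of inner products rather than a full two-dimensional search over the whole parameter space.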

Simulation and Analysis
In this section, we present simulation results to demonstrate the performance of the proposed approach for denoising and parameter estimation.

Training Set and Test Set of DnCNN
In our simulation, the number of sampling points of the observation y(n) is fixed at N = 513. The midpoint of the signal is the origin of time. The sample frequency is 256. The system contains only chirp signals and additive white Gaussian noise, without other interference. The parameter ranges of the training and validation sets are f0 ∈ [9, 13] and k ∈ [4, 7], with an interval of 0.2, as shown in Figure 6a. The parameter range of the test set is the same as that of the training set, but the parameters are randomly generated and accurate to one decimal place. The SNR of the training set varies from -8 dB to -3 dB at an interval of 1 dB. We randomly generate m_i signals with the same parameters f0 and k at each specific SNR, as shown in Table 1; the parameter combinations cover all values within the parameter setting range. The validation set size is one-fifth of the training set. For the test set, we randomly generate 500 signals with SNRs from -8 dB to 1 dB, as shown in Figure 6b. In the training process of DnCNN, we set the reshaped signal image array to Y0 ∈ R^(36×57). The mini-batch size is 128 and the maximum number of epochs is 4. Before each training epoch, we shuffle the training data. The initial learning rate is 0.0002 and is updated every epoch by multiplying by a learning rate drop factor of 0.9. The training is completed on an NVIDIA TITAN V GPU with the MATLAB Deep Learning Toolbox.
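Generating one noisy training signal at a prescribed SNR can be sketched as follows (a sketch assuming the standard definition SNR = 10·log10(P_signal / P_noise); the real chirp, N = 513, the sample frequency of 256, and the time origin at the midpoint follow the setup above).

```python
import numpy as np

def noisy_chirp(f0, k, snr_db, fs=256.0, n=513, rng=None):
    """Real mono-component chirp with the midpoint of the signal as the
    origin of time, blurred by additive white Gaussian noise whose power
    is set by the requested SNR in dB. Returns (observation, clean label)."""
    rng = np.random.default_rng(rng)
    t = (np.arange(n) - n // 2) / fs
    x = np.cos(2 * np.pi * (f0 * t + 0.5 * k * t ** 2))   # clean label x(n)
    noise_power = np.mean(x ** 2) / 10 ** (snr_db / 10)
    y = x + rng.normal(scale=np.sqrt(noise_power), size=n)
    return y, x
```

Sweeping f0 over [9, 13], k over [4, 7], and the SNR over [-8, -3] dB with this generator reproduces the coverage described above for the training set.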

Numerical Results and Analysis
For simplicity, the proposed method is called LsML, and the TF coarse estimator is called TFCE. An estimator that uses the denoised signal ŷ(n), as shown in Figure 7, is prefixed with DnCNN. We compare LsML with six other methods: RWT [8] and FrFT [15], which directly use the observation y(n), TFCE, DnCNN-FrFT, DnCNN-RWT, and DnCNN-LsML. Figure 8 compares the performance of the above methods in estimating the chirp rate k̂ and the initial frequency f̂0. Here, the rotation angle of the RWT runs from 85 to 91 degrees with an interval of 0.01, and the fractional power of the FrFT runs from 0.96 to 1.08 with an interval of 0.0001. The performance metric is the natural logarithm of the Root Mean Squared Error, as described in Equations (10) and (11),
where M = 500 is the number of signals in the test set, f_0i and k_i are the true values, and f̂_0i and k̂_i are the estimation results. The parameter search step of LsML is 0.02, and we set the limitation widths so that f ∈ [f̃0 − 0.15, f̃0 + 0.15] and k ∈ [k̃ − 0.35, k̃ + 0.35]. Table 2 lists the average runtime over 100 signals for these methods.

It can be concluded from these results that the proposed LsML has excellent performance for chirp parameter estimation. Comparing the results of FrFT, RWT, DnCNN-FrFT, and DnCNN-RWT in Figure 8a, we can see that after DnCNN denoising the accuracy of k̂ is significantly improved, which means that ŷ(n) has a higher SNR than y(n). However, comparing the results of LsML and DnCNN-LsML in Figure 8a,b, we can see that the observation y(n) is more suitable as the input to the ML fine estimator: after DnCNN denoising, although the SNR of ŷ(n) has improved considerably, as shown in Figure 9, the signal parameters have changed slightly. As analyzed before, the accuracy of TFCE determines the value of the limitation width σ, which further affects the performance of LsML. The impact of different σ on the estimation accuracy of LsML is shown in Figure 10a,b; we can see that when the SNR is very low, a larger limitation width σ is needed.

For generality and comparability, we conducted the simulation experiments against a background of additive Gaussian noise. Generally, as long as the spectral width of the noise is much larger than the bandwidth of the system and the spectral density within this bandwidth is essentially constant, the noise can be treated as white. For example, thermal noise and shot noise have a uniform power spectral density over a wide frequency range and can usually be considered white noise. Specific application scenarios present different noise environments.
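The ln-RMSE metric of Equations (10) and (11) is a direct transcription of "natural logarithm of the Root Mean Squared Error" over the M test signals and can be computed as:

```python
import numpy as np

def ln_rmse(estimates, truths):
    """Natural logarithm of the Root Mean Squared Error over the test set,
    as in Equations (10) and (11)."""
    e = np.asarray(estimates, dtype=float) - np.asarray(truths, dtype=float)
    return float(np.log(np.sqrt(np.mean(e ** 2))))
```

The same function is applied separately to the initial-frequency estimates (Equation (10)) and to the chirp-rate estimates (Equation (11)).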
Special interference under certain conditions needs to be studied separately and explored in depth, e.g., the Doppler shift and multipath encountered in underwater acoustic engineering. Because the time-domain characteristics of the signal change substantially in such cases, the convolutional network may not be able to extract appropriate features from the training set based on the data labels, and the method proposed in this article may not apply to such interference.

Conclusions
In this paper, three primary ideas are proposed. First, a DnCNN-based method for extracting highly structured chirp signals in low SNR conditions is proposed; it takes advantage of the feature extraction ability of deep learning to recover the noiseless chirp signal. Second, a TFCE is proposed: we directly use the IF of the denoised signal to pre-estimate the chirp parameters, which not only attains considerable accuracy but also improves the performance of the ML step. Finally, by resorting to the local search method, a non-iterative local search estimator based on ML is proposed. Simulation results show that the proposed LsML outperforms several methods in terms of parameter estimation accuracy and efficiency in low SNR scenarios.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest:
The authors declare no conflict of interest.