1. Introduction
Ground penetrating radar (GPR) is a surface geophysical method that utilizes high-frequency broadband electromagnetic waves (1 MHz–10 GHz) to detect and locate structures or objects in the shallow subsurface [
1,
2]. GPR offers high resolution, strong resistance to interference, and high efficiency, and it is nondestructive; consequently, it has been used extensively in many fields, such as geological exploration, water conservancy engineering, and urban construction [
3]. There has been an increasing tendency to use nondestructive testing techniques that do not alter the reinforcement elements of vulnerable structures, such as a combined methodology that uses GPR and infrared thermography (IRT) to detect and evaluate corrosion. In [
4], cracked cement concrete layers located below the asphalt layer of rigid pavements were similarly investigated. Such damage is difficult to detect, and nondestructive surveys are applied in many cases for this purpose. However, the difficulty of data interpretation limits their use [
5]. Furthermore, GPR profiles are affected by various factors, such as complex and varying detection environments, the instrument system, and the data acquisition mode, which introduce various forms of clutter and noise that reduce the quality of the radar signal. Therefore, it is particularly important to develop fast and effective noise attenuation algorithms to obtain GPR data with a high signal-to-noise ratio (SNR) [
6,
7].
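Since the SNR is the figure of merit used throughout this discussion, a minimal sketch of one common way to compute it is given here. The definition (signal energy over residual-noise energy, in decibels) and the function name `snr_db` are assumptions made for illustration; the cited works may use a different convention. NumPy is assumed to be available.

```python
import numpy as np

def snr_db(clean, noisy):
    """SNR in decibels: energy of the clean signal over energy of the residual noise.

    Assumed definition for illustration only; `clean` and `noisy` must have
    the same shape (a single trace or a whole profile).
    """
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noisy, dtype=float) - clean
    return 10.0 * np.log10(np.sum(clean**2) / np.sum(noise**2))
```

This definition requires a noise-free reference, so it applies to synthetic data; for field data a reference-free estimate would be needed.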
In recent years, many researchers have studied methods of attenuating GPR noise. Common noise attenuation algorithms include the curvelet transform, empirical mode decomposition (EMD), and the wavelet transform. To improve the clarity of GPR data for underground pipeline positioning, a method based on the curvelet transform was proposed that reduces clutter and profile artifacts so that the significant echoes stand out; the experiments showed satisfactory qualitative and quantitative results [
8]. However, the conventional curvelet transform is ineffective in the presence of strong linear interference because it cannot adaptively remove noise according to the signal characteristics. Hence, a method called the empirical curvelet transform, which can suppress interference signals, was proposed and compared with the conventional curvelet transform; the results confirmed the effectiveness of the method [
9]. In addition, to remove noise from GPR echo signals, a denoising method that was based on ensemble EMD (EEMD) and the wavelet transform was presented in [
10]; as compared with other common methods, the EEMD-wavelet method improves the SNR. Ref. [
11] first used complete EEMD (CEEMD) to perform time-frequency analysis of GPR signal data. CEEMD was shown to solve the mode mixing problem of EMD and to significantly improve on the resolution of EEMD when the GPR signal to be processed has a low SNR, thereby avoiding the disadvantages of both EMD and EEMD. In a comparison with EMD and EEMD, CEEMD obtained higher spectral and spatial resolution. To further reduce random GPR noise beyond what EMD-based denoising achieves, an EMD technique combined with basis pursuit denoising (BPD) was developed and provided satisfactory outputs [
12]. Ref. [
13] extended
EMD to form a semiadaptive dip filter for GPR data to adaptively separate reflections at different dips. Ref. [
14] used the two-dimensional Gabor wavelet transform to process signals and proposed a new denoising method for extracting the reflected signals of buried objects. In a comparison with the
filter, the effectiveness of this method was demonstrated. Another alternative is the drumbeat-beamlet (dreamlet) transform. Because the dreamlet basis automatically satisfies the wave equation, it can provide an effective way to represent the physical wave field. Ref. [
15] theoretically deduced the representation of the damped dreamlet and reported its geometric explanation and analysis. Furthermore, a GPR denoising approach that was based on the empirical wavelet transform (EWT) in combination with semisoft thresholding was proposed, and a spectrum segmentation strategy was designed that accounted for different frequency characteristics of different signals; this method achieved better performance than CEEMD and the synchrosqueezed wavelet transform (SWT) [
16].
Nevertheless, all of the above algorithms are based on domain transformation. To further improve the noise attenuation performance and increase the data fidelity, many researchers have studied strategies involving the sparse representation (SR) of signals or signal processing combined with morphology. Exploiting the correlation of the signal, the covariance matrix of GPR data was decomposed into eigenvalues and corresponding eigenvectors, and a linear transformation was applied to the GPR data to obtain the principal components (PCs), where the lower-order PCs represent the strongly correlated target signals of the raw data and the higher-order PCs represent the uncorrelated noise; thus, principal component analysis (PCA) extracted the target signal and effectively filtered out the uncorrelated noise [
17]. Implementing the SR of a signal is an effective method that can use the sparsity and compressibility of noisy data to estimate the signal from noisy data; in this method, signal estimation can be achieved by relinquishing some unimportant bases and eliminating random noise. Ref. [
18] derived a damped SR (DSR) of a signal; a damping operator is employed in the DSR to obtain greater accuracy in signal estimation. Additionally, a seismic denoising method based on the physical wavelet and sparse Bayesian learning (SBL) was developed in [
19]. In the SBL algorithm, the physical wavelet can be estimated from various seismic and even logging data and can correctly describe the different characteristics of these seismic data. Moreover, the trade-off regularization parameter that determines the quality of noise reduction can be estimated adaptively according to the updated data mismatch and sparsity in the iterative process. Examples using synthetic and real seismic data demonstrated the effectiveness of the SBL method.
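The PCA-based filtering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the cited work: the profile is assumed to be a 2D array with one trace per column, the function name `pca_denoise` and the parameter `n_components` are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def pca_denoise(profile, n_components):
    """Keep only the strongest principal components of a 2D profile.

    profile: array of shape (n_samples, n_traces), one trace per column
             (assumed layout for this sketch).
    n_components: number of lower-order (largest-eigenvalue) PCs to keep.
    """
    profile = np.asarray(profile, dtype=float)
    mean = profile.mean(axis=1, keepdims=True)       # mean trace
    centered = profile - mean
    # Covariance between time samples, estimated across traces.
    cov = centered @ centered.T / (profile.shape[1] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)           # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                # strongest first
    basis = eigvecs[:, order[:n_components]]
    # Project onto the retained eigenvectors and map back to the data domain.
    return basis @ (basis.T @ centered) + mean
```

The choice of `n_components` controls the trade-off: too few components discard weak target reflections, while too many retain uncorrelated noise.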
Another conventional technique, namely, time-domain singular value decomposition (SVD), introduces pseudosignals that did not previously exist when eliminating the direct waves and poorly suppresses the random noise surrounding the nonhorizontal phase axes. To resolve these inadequacies, an SVD method in the local frequency domain of GPR data based on the Hankel matrix was proposed, and a comparison showed that this method could improve the suppression of random noise in proximity to nonhorizontal phase reflections [
20]. In addition, a new dictionary learning method, namely structured graph dictionary learning (SGDL), was recently proposed by adding the local and nonlocal similarities of the data via a structured graph, thereby enabling the dictionary to contain more atoms with which to represent seismic data; the SGDL method was shown to effectively remove strong noise and retain weak seismic events [
21]. In [
22], the authors addressed the denoising of high-resolution radar image series in a nonparametric Bayesian framework; this method imposes a Gaussian process (GP) model on the corresponding time series of each pixel and effectively denoises the image series by implementing GP regression. Their method exhibited improved flexibility in describing the data and superior performance in preserving the structure while denoising, especially in scenarios with a low SNR. Furthermore, the authors of [
23] proposed a modified morphological component analysis (MCA) algorithm and applied it to the denoising of GPR signals. The core of their MCA algorithm is the selection of an appropriate dictionary, which combines the undecimated discrete wavelet transform (UDWT) dictionary with the curvelet transform dictionary (CURVELET). The modified MCA algorithm was compared with SVD and PCA to confirm its superior performance. In another study, the authors first presented the basic principles and methods of mathematical morphology. Subsequently, they combined the Ricker wavelet and low-frequency noise to form a synthetic dataset for verifying the feasibility and performance of the MCA method. According to the results of the synthetic example, the proposed method can effectively suppress the large-scale low-frequency noise in the original data while only slightly suppressing the small signals that exist in the original data. Finally, the proposed method was applied to field microseismic data, and the results were encouraging [
24]. The authors of [
25] developed a novel algorithm based on the difference in seismic wave shapes and introduced mathematical morphological filtering (MMF) into the attenuation of coherent noise. A rotating coordinate system is established along the coherent noise trajectory so that the energy of the coherent noise is distributed in the horizontal direction, and the morphological operation is calculated along the trajectory direction in this coordinate system. When compared with other existing technologies, this MMF method is more effective in rejecting outliers and reduces artifacts. In addition, a new method for enhancing the GPR signal, based on a subspace method and a clustering technique, was proposed. It improves the estimation accuracy in a noisy context and is used with a compressive sensing method to estimate the time delays of the echoes backscattered from layered media in the GPR signal [
26].
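To illustrate the Hankel-matrix idea behind the SVD method mentioned above, the following sketch applies a truncated SVD to the Hankel matrix of a single trace. It is a deliberate simplification: it works directly on the time-domain trace rather than in the local frequency domain used by the cited method, and the names `hankel_svd_denoise`, `rank`, and `window` are hypothetical. NumPy is assumed to be available.

```python
import numpy as np

def hankel_svd_denoise(trace, rank, window=None):
    """Low-rank approximation of the Hankel matrix built from a 1D trace.

    Simplified time-domain sketch; the cited method operates in the
    local frequency domain instead.
    """
    trace = np.asarray(trace, dtype=float)
    n = trace.size
    n_rows = window or n // 2 + 1        # rows of the Hankel matrix
    n_cols = n - n_rows + 1              # columns of the Hankel matrix
    # idx[i, j] = i + j, so H[i, j] = trace[i + j] (constant anti-diagonals).
    idx = np.arange(n_rows)[:, None] + np.arange(n_cols)[None, :]
    H = trace[idx]
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    H_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]     # keep the largest singular values
    # Average each anti-diagonal to map the matrix back to a 1D trace.
    out = np.zeros(n)
    counts = np.zeros(n)
    np.add.at(out, idx, H_low)
    np.add.at(counts, idx, 1.0)
    return out / counts
```

A noise-free sampled sinusoid yields a Hankel matrix of rank 2, so `rank=2` reproduces it exactly, while additive random noise spreads over the discarded singular values.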
Most of the above noise attenuation algorithms are based on the SR strategy of signals and adopt domain transformation to process the data. Nevertheless, these approaches all rely on a fixed transformation basis and cannot adjust themselves according to the characteristics of various signals. Hence, they cannot accurately represent complex GPR signal data. Thus, it is necessary to develop a denoising method with an adaptive transform basis that is driven by the characteristics of GPR data. One possible solution is the deep convolutional denoising autoencoder (CDAE), a random noise attenuation method built on a deep learning architecture and trained as an unsupervised neural network. Deep CDAEs are mainly composed of two types of networks: encoders and decoders. In the context of this research, the encoder encodes noisy GPR profile data into multiple levels of abstraction to extract 1D latent vectors containing important features, while the decoder decodes these latent vectors to reconstruct the noise-free signal and, thus, eliminate random noise. Such algorithms are often used in the fields of noise attenuation and image generation.
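The encoder-decoder principle can be written compactly as follows. This is a generic formulation of a denoising autoencoder under stated assumptions, not notation taken from this paper: the symbols x (clean patch), n (noise), f_θ (encoder), g_φ (decoder), z (latent vector), and N (number of training patches), the additive noise model, and the mean squared error loss are all assumed here for illustration.

```latex
\tilde{x} = x + n, \qquad
z = f_{\theta}(\tilde{x}), \qquad
\hat{x} = g_{\phi}(z), \qquad
\min_{\theta,\,\phi}\; \frac{1}{N} \sum_{i=1}^{N}
  \bigl\lVert x_i - g_{\phi}\bigl(f_{\theta}(\tilde{x}_i)\bigr) \bigr\rVert_2^2
```

Because the latent vector z has limited capacity, the network is pushed to keep the coherent structure of the profile and to drop the random component that it cannot predict.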
Models that are based on deep learning show great promise in terms of noise attenuation. However, the disadvantages of these methods are that a large number of training samples are required and the computational costs are very high. Refs. [
27,
28] showed that denoising autoencoders constructed using convolutional layers with a small sample size can be used to effectively denoise medical images and they can combine heterogeneous images to increase the sample size, thereby improving denoising performance. In [
29], the authors proposed using deep fully convolutional denoising autoencoders (FCDAEs) instead of deep feedforward neural networks (FNNs), and their experimental results showed that deep FCDAEs perform better than deep FNNs, despite having fewer parameters. In addition, a novel data preprocessing method was proposed that uses data points between adjacent samples to obtain a set of training data. To obtain a better SR, they constructed standard penalty points that are based on the combination of the standard penalty points of
and
, and a comparison with normal denoising autoencoders verified the superiority of this method [
30]. Ref. [
31] proposed the deep evolving denoising autoencoder (DEVDAN), which has an open structure in both the generative phase and the discriminative phase and can automatically and adaptively add and discard hidden units. In the generative phase, unlabeled data are used to improve the predictive performance of the discriminative model; the discriminative model is then optimized and modified accordingly, so that the two phases reach a dynamic balance and the accuracy of the overall model prediction improves. Ref. [
32] developed a new denoising/decomposition method based on deep neural networks, called DeepDenoiser. The DeepDenoiser network uses a mask strategy. First, the input signal is decomposed into the signal of interest and uninteresting signals, which are defined as noise. This noise includes not only the usual Gaussian noise but also various nonseismic signals. Subsequently, the network learns an SR of the data in the time-frequency domain together with nonlinear functions that map this representation into the masks. The trained DeepDenoiser network can suppress noise with minimal changes to the required waveform even when the noise level is very high, thereby greatly improving the SNR. DeepDenoiser has clear applications in seismic imaging, microseismic monitoring, and environmental noise data preprocessing. More recently, Ref. [
33] proposed a new method that is based on the deep denoising autoencoder (DDAE) to attenuate random seismic noise.
In summary, the conventional noise attenuation methods can be roughly divided into four categories: (1) methods based on a fixed transformation basis, (2) methods based on a sparse representation, (3) methods based on morphological component analysis, and (4) methods based on deep learning. Table 1 summarizes these strategies.
GPR signals attenuate more rapidly than seismic signals, and GPR waveforms are more complicated due to the different methods of observation. If a deep CDAE is directly applied to attenuate the noise of synthetic GPR data and field data, the GPR profile will be distorted and the attenuation of noise will be incomplete due to overfitting, an incorrect size of the local receptive field, and the representational bottlenecks and vanishing gradients that are encountered in deep learning. To solve these problems, the authors modified the structure of deep CDAEs by adding a dropout regularization layer, an atrous convolutional layer, and a residual-connection structure. The resulting strategy, convolutional denoising autoencoders with network structure optimization (CDAEsNSO), consists of atrous-dropout CDAEs (AD-CDAEs) and residual-connection CDAEs (ResCDAEs), both of which effectively improve on the performance of conventional CDAEs. CDAEsNSO exhibits a strong noise attenuation capability and good adaptability to different data and various types of noise, and it does less damage to the information in the original profile, thereby maintaining a high level of fidelity.
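The three structural ingredients named above can be illustrated in isolation with a small 1D sketch. It shows what each building block computes, not the CDAEsNSO architecture itself: the function names, the "same" zero padding, the ReLU activation, and the inverted-dropout scaling are assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def atrous_conv1d(x, kernel, dilation=1):
    """Zero-padded 'same' 1D cross-correlation with `dilation - 1` gaps between taps."""
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    span = (kernel.size - 1) * dilation          # receptive field minus one
    left = span // 2
    padded = np.pad(x, (left, span - left))
    out = np.zeros_like(x)
    for k, w in enumerate(kernel):
        out += w * padded[k * dilation : k * dilation + x.size]
    return out

def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1

def dropout(x, rate, rng):
    """Inverted dropout: zero a fraction `rate` of activations during training."""
    x = np.asarray(x, dtype=float)
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def residual_block(x, kernel, dilation=1):
    """Skip connection: the layer only has to learn a correction added to x."""
    x = np.asarray(x, dtype=float)
    return x + np.maximum(atrous_conv1d(x, kernel, dilation), 0.0)   # ReLU
```

With a kernel of size 3, raising the dilation from 1 to 2 widens the receptive field from 3 to 5 samples without adding weights, which is how atrous convolution adjusts the local receptive field; the skip connection gives gradients a direct path around the layer, and dropout discourages co-adapted features that lead to overfitting.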
8. Conclusions
When compared with data domain transformation-based methods and SR-based techniques, convolutional denoising autoencoders (CDAEs) that are based on deep learning can adjust themselves according to the features of signals. CDAEs are a kind of unsupervised learning neural network that is well suited to denoising. However, when CDAEs are directly applied to attenuate the noise of GPR data, they encounter various problems, including overfitting, an incorrect size of the local receptive field, and the representational bottlenecks and vanishing gradients that are typical of deep learning. Therefore, the authors proposed some network structure optimization strategies, such as the addition of a dropout regularization layer, an atrous convolution layer, and a residual-connection structure, and obtained a new GPR noise attenuation algorithm, namely CDAEsNSO.
CDAEsNSO, which was proposed based on CDAEs, can effectively remove random noise and Gaussian spike impulse noise from GPR data. Moreover, the proposed algorithm does little damage to useful waveforms in the original profile, such as reflected waves, diffracted waves, and multiples, and maintains high data fidelity, effectively improving the noise attenuation effect. At the same time, this method has certain limitations. For example, a large amount of data is required as learning samples, and network training is computationally intensive, which demands more capable computer hardware and more time than other algorithms. However, once the network training is completed, the model can be used directly for end-to-end processing. In the final processing stage, unlike other algorithms, nothing needs to be recalculated; therefore, the efficiency of data processing is higher.
The GPR profiles contain considerable amounts of redundant information, whereas the detailed features in a whole profile are sparse. Therefore, we proposed a sliding window method that processes GPR profile data to obtain training datasets and combines heterogeneous images to boost the sample size for improved noise attenuation performance. This method not only avoids redundant information while capturing more detailed waveform characteristics, but also uses fewer GPR data to obtain more training datasets that meet the training requirements of CDAEs.
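A sliding window of this kind can be sketched as follows. The square window, the function name `sliding_windows`, and the parameters `size` and `stride` are assumptions made for illustration; the actual window shape and step used in this work may differ. NumPy is assumed to be available.

```python
import numpy as np

def sliding_windows(profile, size, stride):
    """Cut a 2D profile into overlapping square patches.

    profile: array of shape (n_samples, n_traces).
    size:    side length of each patch (assumed square for this sketch).
    stride:  step between neighbouring windows; stride < size gives overlap.
    Returns an array of shape (n_patches, size, size).
    """
    profile = np.asarray(profile)
    rows, cols = profile.shape
    patches = [
        profile[r:r + size, c:c + size]
        for r in range(0, rows - size + 1, stride)
        for c in range(0, cols - size + 1, stride)
    ]
    return np.stack(patches)
```

For example, an 8 x 10 profile cut with `size=4` and `stride=2` yields 12 patches, so a single profile contributes many training samples, and patches from different profiles can simply be concatenated along the first axis.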