Learning-Based Clutter Mitigation with Subspace Projection and Sparse Representation in Holographic Subsurface Radar Imaging

Abstract: The holographic subsurface radar (HSR) is an effective remote sensing modality for surveying shallowly buried objects with high-resolution plan-view images. However, strong reflections from the rough surface and inhomogeneities obscure the response of stationary targets. In this paper, a learning-based method is proposed to mitigate the clutter in HSR applications. The proposed method first decomposes the HSR image into raw clutter and target data using an adaptive subspace projection approach. Then, an autoencoder is applied to carry out unsupervised learning to extract the target features and mitigate the clutter. Sparse representation is also incorporated to further optimize the model, and the alternating direction method of multipliers (ADMM) is used to solve the optimization problem for precision and efficiency. Experiments using real data were conducted to demonstrate that the proposed method can effectively mitigate the strong clutter with the target preserved. The visual and quantitative results show that the proposed method achieves superior performance in suppressing clutter in HSR images compared with widely used state-of-the-art clutter mitigation approaches.


Introduction
Microwave imaging has been successfully used as a non-destructive remote sensing modality in subsurface target surveys, including structural assessment [1], landmine detection [2][3][4], and geological exploration [5,6]. Holographic subsurface radar (HSR) is one such technology. It uses narrowband electromagnetic waves at several discrete frequencies and employs plane scanning of a surface to record subsurface radar holograms with high plan-view resolution and low radar cost [7,8]. However, the visibility of shallowly buried targets in HSR images is usually obscured by clutter contamination, such as surface reflection, antenna coupling, and scattering from inhomogeneities [9].
Several techniques have been proposed to mitigate the clutter. One simple and convenient method is mean subtraction (MS) [10]. MS can be regarded as a filter in the time-space domain that averages the ensemble of radar data and subtracts the mean to reduce the clutter, but this approach distorts the intensity of the target response. Parameter estimation works well when the model parameters are estimated accurately, but how to achieve this estimation is still an open issue [11]. Time gating is an intuitive and simple method to suppress direct coupled waves and ground surface reflection, but it cannot be applied in HSR because of the limited bandwidth and range resolution [12]. The subspace projection methods, such as singular value decomposition (SVD) [13][14][15], principal component analysis (PCA) [16], and independent component

Holographic Subsurface Radar Model
Unlike a GPR B-scan, which is formed along either the cross-track or down-track direction as columns, HSR signals are collected by the radar system in a plan-view scan of a surface. In this case, each element of the radar echo matrix is an integral of the received signals within the antenna beamwidth at the corresponding scanning location. A simple scenario of HSR imaging through a medium is shown in Figure 1, where a coordinate system is defined to represent the region of interest. For the simple case of a continuous wave (CW) HSR, the received radar signal at the location $(x, y)$ can be represented as $s(x, y)$. To form a HSR image, the 2-D fast Fourier transform is performed on $s(x, y)$ to generate the saliency representation $S(k_x, k_y)$, where $k_x$ and $k_y$ denote the wavenumber variables in the X and Y directions parallel to the scanning plane. Then, a matched filter is imposed on $S(k_x, k_y)$ in the wavenumber domain as
$$H(k_x, k_y; z) = \frac{j 2\pi (2k_0)^3 z^3}{k_z^4}\, e^{j k_z z} \quad (1)$$
where $z$ denotes the depth in the medium below the surface, $k_0 = 2\pi f \sqrt{\varepsilon_r} / c$ denotes the total wavenumber, and $k_z = \sqrt{k_0^2 - k_x^2 - k_y^2}$ denotes the wavenumber variable in the Z direction.
Finally, the 2-D holographic image can be obtained by applying the 2-D inverse fast Fourier transform to the result of matched filtering as
$$I(x, y; z) = \mathrm{IFFT2}\{S(k_x, k_y)\, H(k_x, k_y; z)\} \quad (2)$$
where IFFT2 denotes the 2-D inverse fast Fourier transform. The optimal holographic image is obtained when the value of $z$ is taken as the actual depth of the targets [35][36][37]. In this article, the formed image is an amplitude hologram, and its size is interpolated to 512 × 512 pixels by zero-padding in the frequency domain during imaging. The operating frequency is an important parameter when designing a HSR system and can be selected according to the required lateral resolution. For a given transmitted signal, the lateral resolution is related to the accumulated observation angle. Ideally, assume the antennas are omnidirectional and scan along the principal X and Y directions. In this case, the observation angle can reach the range [−π, π], and the lateral resolution, given in [7], is determined by λ, the wavelength of the operating signal.
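The imaging chain described above (2-D FFT, matched filtering, 2-D inverse FFT) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `hsr_image`, its arguments, and the decision to zero out evanescent components (where $k_z^2 \le 0$) are assumptions made here, and the filter expression follows the matched filter of Equation (1) as read from the text.

```python
import numpy as np

C0 = 299_792_458.0  # speed of light in vacuum (m/s)

def hsr_image(s, dx, dy, freq, eps_r, z):
    """Amplitude hologram from CW data s(x, y) sampled on a dx-by-dy grid (metres)."""
    ny, nx = s.shape
    k0 = 2 * np.pi * freq * np.sqrt(eps_r) / C0      # wavenumber in the medium
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)        # wavenumbers along X (rad/m)
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dy)        # wavenumbers along Y (rad/m)
    KX, KY = np.meshgrid(kx, ky)
    kz2 = k0**2 - KX**2 - KY**2
    propagating = kz2 > 0                            # assumption: drop evanescent terms
    kz = np.sqrt(np.where(propagating, kz2, 1.0))    # dummy value 1.0 where masked
    # Matched filter for focusing depth z; zero outside the propagating region.
    H = np.where(
        propagating,
        1j * 2 * np.pi * (2 * k0) ** 3 * z**3 / kz**4 * np.exp(1j * kz * z),
        0.0,
    )
    S = np.fft.fft2(s)                               # wavenumber-domain data
    return np.abs(np.fft.ifft2(S * H))               # amplitude hologram
```

For example, `hsr_image(s, 0.005, 0.005, 10e9, 6.0, 0.03)` would focus a scan with a 0.5 cm step at 10 GHz through a medium of relative permittivity 6 at a depth of 3 cm; the zero-padding to 512 × 512 pixels mentioned above is omitted for brevity.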

Theory of Autoencoder
An autoencoder is a type of artificial neural network for unsupervised learning that uses an encoder and a decoder to reconstruct the input data. Figure 2 shows the network structure of our autoencoder, which is used to mitigate clutter. The encoder contains five convolutional layers with a kernel size of (3, 3), and each convolutional layer is followed by a rectified linear unit (ReLU) activation function, which helps avoid vanishing gradients compared to other activation functions (such as sigmoid or tanh). Max pooling is performed with 2 × 2 windows and a stride of 2 to downsample the feature maps after the first and last convolutional layers. The decoder can be understood as the reverse operation of the encoder: it restores the latent vector and ultimately produces an output with the same dimensions as the encoder input, thereby completing the denoising of the images. The input layer and hidden layer form the encoder, which compresses the high-dimensional input into a low-dimensional latent-space representation. The output layer works as the decoder, which reconstructs the input by the inverse mapping from the latent-space representation. Autoencoders have been widely used in data denoising, dimensionality reduction, and image generation [38][39][40][41][42]. In [26], an autoencoder-based clutter mitigation method was proposed; its advantage is that it requires neither prior information regarding the characteristics of the penetrable medium nor an analytic framework to describe the through-medium interference. Instead, the cluttered radar images are considered to be the noisy input and their corresponding target images are treated as the clean output. Then, the autoencoder-based algorithm learns how to denoise and clean the corrupted images with training data of cluttered and target images.
Mathematically, the training dataset comprises $K$ cluttered images $\{X_1, X_2, \ldots, X_K\}$ and their corresponding clean target images $\{X^t_1, X^t_2, \ldots, X^t_K\}$, which are obtained by subtracting the background (radar signals collected from the same scene with targets removed) from the raw signals. For the input cluttered image $X$, the image denoising process of the autoencoder can be divided into two steps. In the first step, the encoder tries to learn a compressed latent representation $Z$ of the input $X$, i.e.,
$$Z = f(WX) \quad (4)$$
where $f(\cdot)$ is a nonlinear activation function, and $W$ is the weight matrix of the encoder.
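In the second step, the decoder reconstructs an estimated output target image $\hat{X}$ by the inverse mapping
$$\hat{X} = \hat{W} Z \quad (5)$$
where $\hat{W}$ is the weight matrix of the decoder. The weight matrices $W$ and $\hat{W}$ are optimized during the training stage by minimizing a loss function that measures the reconstruction error between $\hat{X}$ and the corresponding clean target image.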
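The two-step mapping above can be illustrated with a deliberately simplified, fully connected sketch. It is not the convolutional network of Figure 2: the helper names, the choice of ReLU for $f(\cdot)$, the column-wise vectorized image layout, and the mean squared error loss are all assumptions made here for illustration, since the exact loss function is not reproduced in this text.

```python
import numpy as np

def encode(W, X):
    """Encoder step Z = f(W X), with f assumed to be ReLU."""
    return np.maximum(W @ X, 0.0)

def decode(W_hat, Z):
    """Decoder step X_hat = W_hat Z (linear inverse mapping)."""
    return W_hat @ Z

def mse_loss(W, W_hat, X_cluttered, X_clean):
    """Assumed training loss: mean squared error between the
    reconstruction and the clean target images (one image per column)."""
    X_hat = decode(W_hat, encode(W, X_cluttered))
    return np.mean((X_hat - X_clean) ** 2)
```

During training, `W` and `W_hat` would be adjusted (for example by gradient descent) to reduce `mse_loss` over the pairs of cluttered and clean images.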

Test of the Standard Autoencoder
To illustrate the autoencoder performance, two test sets were collected at the Laboratory of Cognitive Radar, National University of Defense Technology, Changsha, China. The stepped-frequency signals, covering a 10-30 GHz frequency band with a step size of 10 MHz, were generated with a vector network analyzer. The first set (Scene I) was acquired from a scene with a metal circular ring against a concrete slab. Photographs of the measurement setup are shown in Figure 3. The concrete slab was constructed as a wall with a thickness of 3 cm and a relative permittivity of approximately 6. For the ring target, the radii of the inner and outer circles are 2 cm and 4 cm, respectively. The antennas were mounted on a 2-D scanning frame (25 cm long and 25 cm wide, with a scanning step of 0.5 cm) and positioned 5 cm above the slab. All settings of the second scene (Scene II) were the same except that the slab thickness was changed to 5 cm; the measurement geometry is also that of Figure 3. The autoencoder was trained with a set of 5000 cluttered images and their corresponding target images obtained by background subtraction (80% for training and 20% for validation; the same split is used in the rest of this article). The training data were recorded in 10 different scenes by our HSR system, which can record a hologram in less than 20 s. In each scene, we obtained 500 training images by setting different kinds of targets, or targets at different locations, with a certain type of medium such as concrete, planks, or bricks.
Figure 4 shows the formed HSR holograms of these two test sets and their corresponding output images obtained with the standard autoencoder. For quantitative analysis, the signal-to-clutter ratio (SCR) was adopted to assess the effectiveness of clutter mitigation processing. The SCR can be calculated as
$$\mathrm{SCR} = 10 \log_{10} \left( \frac{\frac{1}{N_t} \sum_{p \in R_t} |I(p)|^2}{\frac{1}{N_c} \sum_{p \in R_c} |I(p)|^2} \right) \quad (7)$$
where $I(p)$ is the $p$-th pixel in the image.
$N_c$ and $N_t$ are the numbers of pixels in the clutter region $R_c$ and the target region $R_t$, respectively. Figure 4a presents the raw holographic image of Scene I obtained at 10 GHz, where the target can be noticed but the clutter is also obvious. Figure 4b shows the image after applying the autoencoder. Comparing Figure 4a,b, we can see that the autoencoder-based algorithm effectively removes clutter in the HSR imaging result, with an SCR improvement of 11.0 dB. However, in Figure 4c, we can hardly see the target signature, since the useful signal suffers extra attenuation through the thicker concrete of Scene II and the surface reflections are dominant in the HSR image. After employing the standard autoencoder, as shown in Figure 4d, the target response is still obscured by the heavy clutter. One reason for the poor performance of the autoencoder in Scene II could be that the target image is completely overlapped by strong clutter; in this case, the massive redundant information increases the difficulty of extracting target information. It could also be an overfitting problem due to the limited number of training pairs. To examine the latter, we doubled the amount of radar data used to train the autoencoder. Figure 5a,b shows the formed images of Scene I and Scene II after clutter mitigation by the retrained autoencoder, respectively. Compared to Figure 4b, the target image in Figure 5a is slightly clearer and the SCR is improved by 0.7 dB. The image of the target is still not distinguishable in Figure 5b, which indicates that simply increasing the amount of training data is not the key to improving the performance of the standard autoencoder when the target response overlaps with strong clutter.
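As a small illustration of the SCR metric, the sketch below computes the ratio of the mean pixel power in the target region to the mean pixel power in the clutter region, in decibels. The function name `scr_db` and the use of a boolean mask are assumptions made here; in particular, the clutter region $R_c$ is taken to be every pixel outside the target mask, which may differ from the regions used in the experiments.

```python
import numpy as np

def scr_db(image, target_mask):
    """Signal-to-clutter ratio in dB.

    image       : 2-D array of pixel values I(p) (real or complex)
    target_mask : boolean array, True inside the target region R_t;
                  the complement is assumed to be the clutter region R_c
    """
    power = np.abs(image) ** 2
    target_power = power[target_mask].mean()     # (1/N_t) * sum over R_t
    clutter_power = power[~target_mask].mean()   # (1/N_c) * sum over R_c
    return 10.0 * np.log10(target_power / clutter_power)
```

For instance, an image whose target pixels have amplitude 10 and whose clutter pixels have amplitude 1 gives a power ratio of 100, that is, an SCR of 20 dB.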

Denoising Autoencoder for HSR with Subspace Projection
In this section, the adaptive SVD approach [29] is introduced into the autoencoder model. Based on an analysis of the signal cross-correlation characteristics, this SVD method concludes that the subspace distribution of the target response is related to the variance of the left singular vectors, and it provides a reliable reference for removing the clutter and extracting the target signal when the obscuring barrier is relatively homogeneous. Since a real scene cannot be completely homogeneous, the adaptive SVD usually yields a relatively clear target image with most, but not all, of the clutter removed. Therefore, the target image with residual clutter can be considered noisy. If we then use the target images produced by subspace projection and their corresponding clean target images for training, the clutter mitigation task is reduced to a denoising process. For the input cluttered image $X_i$, the output target image of the denoising autoencoder model can be written as
$$\hat{X}^t_i = D(E(Y_i)) \quad (8)$$
where $\hat{X}^t_i$ is the target image estimated by the denoising autoencoder, $Y_i$ is the target subspace calculated by the adaptive SVD from $X_i$, and $E(\cdot)$ and $D(\cdot)$ represent the encoder and decoder, respectively. In the modified model, 5000 subspace projection matrices and their corresponding real target images are similarly used to train the autoencoder, thereby reducing the redundant information and improving the extraction of salient information. Figure 6a shows the imaging result of Scene II obtained after the adaptive SVD. With the calculated clutter subspace removed, the target response is much stronger than that shown in Figure 4c. Figure 6b presents the output image produced by the denoising autoencoder with subspace projection, which has less clutter than the results of the standard autoencoder and the adaptive SVD in Figures 4d and 6a, respectively.

Learning-Based Clutter Mitigation with Subspace Projection and Sparse Representation
By using the denoising autoencoder with subspace projection, a clear target image with most of the clutter removed is obtained; however, residual clutter remains that may raise false alarms. Inspired by the success of Robust Principal Component Analysis (RPCA) in GPR clutter mitigation, which separates the clutter and target image by solving a low-rank and sparse decomposition (LRSD) problem [20], we optimize the autoencoder model to further mitigate the clutter under the RPCA framework.
Mathematically, an observed HSR data matrix $X$ can be decomposed into three components, i.e., the clutter matrix $C$, the target matrix $T$, and the noise matrix $N$, whose Frobenius norm is assumed to satisfy $\|N\|_F \le \varepsilon$ for $\varepsilon > 0$. Thus, the recorded data can be modelled by
$$X = C + T + N \quad (9)$$
Since directly extracting the target matrix from the model is a non-convex problem, RPCA treats clutter as a low-rank matrix and targets as a sparse matrix based on LRSD theory. Then Equation (9) can be rewritten as
$$X = L + S + N \quad (10)$$
where $L \in \mathbb{R}^{M \times N}$ and $S \in \mathbb{R}^{M \times N}$ correspond to the low-rank clutter and sparse target parts, respectively. The problem in Equation (10) can be solved by the following convex optimization [21]:
$$\min_{L, S} \|L\|_* + \lambda \|S\|_1 \quad \text{s.t.} \quad \|X - L - S\|_F \le \varepsilon \quad (11)$$
where $\|\cdot\|_*$ denotes the nuclear norm, which sums the singular values of the matrix, $\|\cdot\|_1$ denotes the $\ell_1$ norm, which sums the absolute values of the matrix entries, and $\lambda$ is a positive regularization parameter that controls the sparsity of the sparse matrix $S$. Under certain noise sparsity and rank upper-bound assumptions, the low-rank clutter matrix $L$ and sparse target matrix $S$ can be recovered. However, RPCA does not perform as well on HSR images as on radar B-scan images, because it is difficult to find an appropriate low-rank matrix to represent clutter in plan-view holograms, where the clutter is distributed homogeneously [29]. Therefore, the proposed method captures the sparse target component in a similar way to RPCA, i.e., by the $\ell_1$ norm, but inherits the subspace selection capability of the adaptive SVD to suppress the strong HSR clutter. In addition, by incorporating the denoising autoencoder term, the clutter mitigation model can be reformulated as the constrained optimization problem in Equation (12), which is minimized over $C$ and $T$, where $Y$ is the initial target estimated by the adaptive SVD, and $\alpha$ and $\beta$ are the regularization parameters. The first term of the model constrains the calculated target matrix to approximate the result of the autoencoder.
Instead of setting the initial values to zero as in RPCA, the proposed model uses the clutter and target matrices obtained by the adaptive SVD as initial values for the iteration, thereby improving the extraction of salient information and leading to better solutions. The constrained problem in Equation (12) can be addressed through the augmented Lagrangian function in Equation (13) [30], where $Z$ denotes the Lagrangian multiplier and $\mu > 0$ is a penalty parameter. The variables $C$, $T$, and $Z$ can then be solved by the ADMM algorithm, which updates each variable alternately by minimizing the augmented Lagrangian function with the other variables fixed. Therefore, Equation (13) is decomposed into subproblems, one for each variable.
Since subproblem (14), which updates $C^{k+1}$, is a least-squares problem regularized by a nuclear norm penalty, it can be solved by singular value shrinkage [30]. The shrinkage operator $\mathcal{T}(a, b)$ is defined as
$$\mathcal{T}(a, b) = \mathrm{sgn}(a) \max(|a| - b, 0)$$
Then we can update $C^{k+1}$ by applying singular value thresholding [43]: the matrix is decomposed by the SVD as $U \Sigma V^{T}$, where $U$ and $V$ are unitary matrices and $\Sigma$ is a diagonal matrix of singular values, and the shrinkage operator is applied to $\Sigma$. Subproblem (15), which updates $T^{k+1}$, can be rewritten after some algebraic manipulations as a least-squares problem regularized by the $\ell_1$ norm plus a constant term independent of $T$; it is solved with the shrinkage operator. Suppose the size of an acquired holographic image is $M \times N$ ($M \le N$); the solving procedure is summarized in Algorithm 1. The proposed clutter mitigation algorithm estimates the clutter and target matrices by solving several subproblems, which have an overall computational complexity of $O(M^3 + N^2 + N^3)$ for each iteration. Even though the proposed model is slightly more computationally expensive than RPCA, which has an overall computational complexity of $O(MN^2)$ for each iteration, it is much more efficient than MCA, which additionally requires appropriate dictionaries. For an image of size 512 × 512, the running times of RPCA, MCA, and the proposed algorithm on the same platform are 1.8 s, 291.4 s, and 4.1 s, respectively.
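To make the alternating updates more concrete, the following is a hedged ADMM sketch. Because the exact objective of Equation (12) is not reproduced in this text, the code assumes the form $\min_{C,T} \tfrac{1}{2}\|T - A\|_F^2 + \alpha\|C\|_* + \beta\|T\|_1$ subject to $X = C + T$, where $A$ stands for the autoencoder output $D(E(Y))$; this objective, the function names, the fixed penalty `mu`, and the stopping rule are all assumptions, and the data are assumed real-valued (amplitude holograms). It should be read as an illustration of singular value thresholding and soft shrinkage inside ADMM, not as Algorithm 1.

```python
import numpy as np

def shrink(a, b):
    """Soft-thresholding (shrinkage) operator: sgn(a) * max(|a| - b, 0)."""
    return np.sign(a) * np.maximum(np.abs(a) - b, 0.0)

def svt(M, tau):
    """Singular value thresholding: shrink the singular values of M by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def admm_clutter_mitigation(X, A, T0, alpha, beta, mu=1.0, n_iter=100, tol=1e-6):
    """ADMM for the ASSUMED model
        min_{C,T} 0.5*||T - A||_F^2 + alpha*||C||_* + beta*||T||_1  s.t.  X = C + T.
    A  : autoencoder estimate of the target image
    T0 : initial target (e.g. from a subspace projection); because C is
         updated first, only the initial T is needed here.
    """
    T = T0.copy()
    Z = np.zeros_like(X)                       # Lagrangian multiplier
    C = np.zeros_like(X)
    for _ in range(n_iter):
        # C-update: nuclear-norm regularized least squares -> SVT
        C = svt(X - T + Z / mu, alpha / mu)
        # T-update: the two quadratic terms merge into one, then soft shrinkage
        B = X - C + Z / mu
        T = shrink((A + mu * B) / (1.0 + mu), beta / (1.0 + mu))
        # Multiplier update on the constraint residual
        R = X - C - T
        Z = Z + mu * R
        if np.linalg.norm(R) <= tol * max(np.linalg.norm(X), 1.0):
            break
    return C, T
```

The dominant cost per iteration in this sketch is the SVD in `svt`, which is consistent with the observation above that the method is somewhat more expensive than a plain RPCA iteration but far cheaper than dictionary-based approaches.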
The structure of the proposed method is shown in Figure 7. First, the raw dataset of the various scenes mentioned above is acquired by our radar system. Then, the input and output datasets for supervised training are obtained by applying the adaptive SVD and background subtraction to the raw dataset. The autoencoder learns how to denoise, or clean, the cluttered images using training data comprising both cluttered and clean data. After the denoising autoencoder is trained, the proposed clutter mitigation model is applied by minimizing a constrained optimization problem with the adaptive SVD and joint sparsity constraints. Finally, a test image from outside the training set is used to evaluate the performance of the model.

Regularization Parameters Tuning
Two regularization parameters, α and β, are used in the proposed algorithm to control the amount of target and clutter information in the solutions. For instance, a large value of β mitigates more background clutter at the expense of removing weak target responses. Therefore, selecting appropriate regularization parameters plays a significant role in suppressing clutter while maintaining the sparse target matrix. In RPCA [20], the regularization parameter is set to $1/\sqrt{\max(M, N)}$. However, this formula may not be appropriate for the proposed algorithm, since the autoencoder term and extra regularization parameters are introduced into the model. Grid search and random search are often employed to determine the regularization parameters. However, these methods are time-consuming when the search range is large. Tuning the parameters with domain knowledge is efficient but not globally accurate. Here, Bayesian optimization, which has been used to tune hyperparameters for global optimization problems in machine learning [32][33][34], is adopted to estimate α and β.
Let $\psi$ denote the regularization parameter set, i.e., $\psi = [\alpha, \beta] \in \Lambda$, where $\Lambda$ is the bounded space of the regularization parameters. Then, the Bayesian optimization problem can be defined as
$$\psi^* = \arg\max_{\psi \in \Lambda} f(\psi) \quad (22)$$
where $f(\psi)$ denotes the cost function used to evaluate the quality score of the target matrix $T$. In this paper, the cost function is defined as the SCR of images obtained from the validation set $X_v$. To perform Bayesian optimization, a prior $p(f)$ over the function should be selected to express assumptions about the function being optimized. Here we choose the Gaussian process prior due to its flexibility and tractability. In the Gaussian process, the posterior distribution of $f(\psi)$ given the values from the former $n$ iterations can be described by the mean function $m(\psi)$ and variance function $\delta^2(\psi)$, i.e.,
$$f(\psi) \sim \mathrm{Normal}\left(m(\psi), \delta^2(\psi)\right) \quad (23)$$
where $\mathrm{Normal}(\cdot)$ denotes the joint Gaussian distribution. Since the variance can be calculated from the mean and covariance, the Gaussian posterior distribution is obtained with the covariance function $c(\psi_i, \psi_j)$ instead of the variance function. In general, the mean function is considered to be a constant, and the covariance function is represented by a kernel function that determines the Gaussian process [44]. The automatic relevance determination (ARD) squared exponential kernel is commonly used in the Gaussian process. However, sample functions with this covariance function are too smooth to describe complex optimization problems in practice. Therefore, the ARD Matérn 5/2 kernel [45] is selected as the covariance function and is defined as
$$c(\psi_i, \psi_j) = \theta_0 \left(1 + \sqrt{5}\, r + \tfrac{5}{3} r^2\right) \exp\left(-\sqrt{5}\, r\right), \quad r = r(\psi_i, \psi_j) \quad (24)$$
where $r(\psi_i, \psi_j)$ is the Mahalanobis distance and $\theta_0$ denotes the covariance amplitude. After obtaining the posterior distribution of $f(\psi)$, an acquisition function is required to estimate the optimal solution $\psi^*$. Here, the Expected Improvement (EI) [46] is chosen as the acquisition function. Suppose $\psi_{\max} = \arg\max_{\psi \in \Lambda_{1:n}} f(\psi)$ denotes the best current observation at iteration $n$.
The expected improvement function is given by
$$a_{EI}(\psi) = \left(m(\psi) - f(\psi_{\max})\right) \Phi(\gamma) + \delta(\psi)\, \phi(\gamma), \quad \gamma = \frac{m(\psi) - f(\psi_{\max})}{\delta(\psi)} \quad (25)$$
where $\delta(\psi)$ is the standard deviation function associated with the Gaussian process, and $\phi(\cdot)$ and $\Phi(\cdot)$ are the probability density function and the cumulative distribution function of the standard normal distribution, respectively. Then, the final optimal solution can be written as
$$\psi^* = \arg\max_{\psi \in \Lambda} a_{EI}(\psi) \quad (26)$$
In this article, a validation set comprising 1000 of the 5000 aforementioned holographic images was used to tune the regularization parameters. Initially, the parameters of the proposed algorithm are set to α = β = 0.0001. The search intervals were set as 0.0001 ≤ α ≤ 0.9 and 0.0001 ≤ β ≤ 0.9. To compute the SCR cost function, the target regions are determined by background subtraction, which removes the clutter. The regularization parameters used in the proposed model are selected as α = 0.4486 and β = 0.5987.
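The two ingredients of the tuning procedure, the ARD Matérn 5/2 covariance and the Expected Improvement acquisition, can be sketched as below. The function names and the per-dimension `length_scales` parameterization of the distance $r$ are assumptions made here, and the closed forms used are the standard textbook expressions for a maximization problem; fitting the Gaussian process itself (computing the posterior mean and standard deviation) is left out.

```python
import math
import numpy as np

def matern52_ard(psi_i, psi_j, theta0, length_scales):
    """ARD Matern 5/2 covariance between two parameter vectors.
    r is the distance after scaling each dimension by its length scale."""
    d = (np.asarray(psi_i, float) - np.asarray(psi_j, float)) / np.asarray(length_scales, float)
    r = math.sqrt(float(d @ d))
    return theta0 * (1.0 + math.sqrt(5.0) * r + 5.0 * r * r / 3.0) * math.exp(-math.sqrt(5.0) * r)

def expected_improvement(mean, std, f_best):
    """Expected Improvement for maximization, given the posterior mean and
    standard deviation at a candidate point and the best observed value."""
    if std <= 0.0:
        return max(mean - f_best, 0.0)            # no uncertainty left
    gamma = (mean - f_best) / std
    pdf = math.exp(-0.5 * gamma * gamma) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(gamma / math.sqrt(2.0)))
    return (mean - f_best) * cdf + std * pdf
```

In a tuning loop, each candidate $\psi = [\alpha, \beta]$ within the search intervals would be scored with `expected_improvement`, and the candidate with the highest score would be evaluated next on the validation SCR.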

Experimental Results
In this section, the proposed clutter mitigation method is validated on laboratory data collected with the experimental HSR system developed by our research group. First, the experimental setup is described. Then, the performance is analyzed and compared with that of existing clutter mitigation methods, both qualitatively and quantitatively, followed by an investigation of the effect of the training data.

Experimental Setup
In addition to Scene II, two other real radar data sets were collected at our laboratory to validate the performance of the proposed method in detecting multiple targets and targets with complicated shapes. One test set was acquired from a scene with two targets behind a simulated wall (Scene III). The targets of interest consisted of two metal circles with diameters of 3 cm and 4 cm, respectively. A 2-cm-thick pine wood board and a 1-cm-thick gypsum slab partition were incorporated into the wall. The other data set used a model pistol as the target, which was covered by several paper documents and buried in a handbag (Scene IV). The antennas were kept 3 cm above the surface of the medium and moved 40 steps at an interval of 0.6 cm along the two horizontal directions. The generated stepped-frequency signal has a bandwidth of 20 GHz centered at 20 GHz. Figures 8 and 9 present photographs of the measurement setups of Scene III and Scene IV, respectively.

Performance Analysis and Comparison
For comparison purposes, six baseline methods, including PCA [16], the standard SVD [13], the adaptive SVD [29], RPCA [21], MCA [19], and the standard autoencoder [38], were implemented and evaluated on the same test scenes. Figure 10 shows the images of Scene II obtained at 10 GHz before and after clutter mitigation. The dataset comprising 5000 holographic images mentioned above was used to train the autoencoder.
As shown in Figure 10a, we can hardly see the ring target in the imaging result of Scene II because the target response is much weaker than the surface reflections. The output images acquired from the subspace projection methods are displayed in Figure 10b-d. Although the estimated clutter subspaces are removed, the target images in Figure 10b-d are still corrupted by the clutter to varying degrees. Figure 10e depicts the result of the RPCA method; since the clutter in the holographic image has irregular shapes, it is difficult to find a low-rank matrix that removes the clutter. The result of the MCA method is shown in Figure 10f, where most of the clutter is suppressed but the target is also removed, demonstrating that constructing adaptive subdictionaries to separate the clutter and target components in HSR images is still an open issue. The standard autoencoder fails to extract the target feature because the response is too weak to be detected, as shown in Figure 10g. The result of the proposed method is presented in Figure 10h, where the clutter is completely removed without compromising the target image. A quantitative analysis of the different clutter mitigation methods in terms of SCR is listed in Table 1. Among these seven methods, the proposed method produces the best SCR of 32.5 dB, followed by the adaptive SVD with an SCR of 7.2 dB. Though the standard autoencoder achieves the third highest SCR of 4.4 dB, it distorts the target shape, which is unacceptable for subsurface object detection.
The holographic images of Scene III at 11 GHz are presented in Figure 11. Without any preprocessing, Figure 11a depicts an image where the target response is obscured by strong clutter. With the subspace projection methods, part of the strong clutter is removed, and the target signatures in the hologram obtained from the adaptive SVD are stronger than those from PCA and the standard SVD, as shown in Figure 11b-d.
This is because PCA and the standard SVD assume that the wall clutter only resides in the first subspace, whereas the reflections backscattered from a heterogeneous wall have been shown to span a multidimensional subspace that is difficult to estimate accurately [14]. Figure 11e,f shows the results obtained using RPCA and MCA. Though the relatively weak clutter is reduced effectively, chunks of clutter still remain in the processed images because both methods capture the sparse components in the radar data to represent the targets. However, the clutter can scatter in various sizes and shapes if the holographic images are not preprocessed. The results in Figure 11g,h show that the autoencoder-based methods not only successfully mitigate the strong clutter but also preserve the target responses. Compared to the standard autoencoder, the proposed method performs slightly better in terms of producing a clearer target image and revealing the reflections of the metal circles. The SCRs of these methods listed in Table 1 also demonstrate the effectiveness of our proposed method. Specifically, the SCR improvement over the standard autoencoder, RPCA, the adaptive SVD, MCA, PCA, and the standard SVD is 11.1, 21.1, 23.3, 23.5, 25.1, and 25.5 dB, respectively. Figure 12 depicts the holographic images of Scene IV at 17 GHz before and after clutter mitigation. Without clutter mitigation, the target shape in the image shown in Figure 12a is barely identifiable compared with that in Figure 12b-d, which present the results after applying PCA, the standard SVD, and the adaptive SVD for clutter removal, respectively. Among these subspace projection methods, the adaptive SVD achieves the best performance and successfully reconstructs the pistol-like target image owing to its subspace identification based on the cross-correlation characteristics of the radar signal. However, the scattered clutter is increased when the subspace is removed.
Figure 12e,f illustrates that RPCA and MCA can effectively suppress the scattered clutter but fail to recover the target response which is obscured by strong interference. It can be seen in Figure 12g that the standard autoencoder reconstructs two chunk targets instead of the pistol shape, while the proposed method yields a relatively accurate target image because the training data are modified to improve the generalization ability of the autoencoder, as shown in Figure 12h. Since the targets are not well recognized in the above images, Figure 12i presents the result of background subtraction as an extra reference. Though our proposed method still obtains the highest SCR on Scene IV compared with the other clutter mitigation methods, the SCR improvement is much lower than those on Scene II and Scene III, which demonstrates the proposed method achieves better performance when the target shape is regular or standardized.

Effect of Training Data
In general, the generalization ability of trained neural networks is related to the amount of training data. Therefore, the amount of training data can also affect the performance of the proposed learning-based clutter mitigation method. To investigate the influence of training data, five datasets are constructed with different amounts of training radar signals. Set I contains 1000 radar images acquired from 10 different test scenes (in each scene, we obtained 100 training images by setting different kinds of targets, or targets at different locations, with a certain type of medium such as concrete, planks, or bricks). Various targets and materials are used in the test scenes. Similarly, Sets II, III, IV, and V comprise 2000, 5000, 8000, and 10,000 radar signals obtained from the same 10 scenes (200, 500, 800, and 1000 training images per scene, respectively, obtained in the same way). After training the learning-based model with these datasets, we applied the proposed method to Scenes I to IV and evaluated the performance in terms of SCR, which is presented in Table 2. From Table 2, it can be clearly seen that the SCR of the proposed method can be improved by increasing the amount of training data. On average, the SCR is improved by 8.0 dB when the amount of training data is increased from 1000 to 2000, whereas the improvement falls to 0.55 dB when the amount of training data is increased from 8000 to 10,000. This slight improvement in SCR indicates that the proposed model does not require a large quantity of training data to achieve good performance.

Discussion
The qualitative and quantitative results of Section 3 clearly demonstrate that the proposed method can effectively mitigate the clutter and improve the SCR. This superiority is attributed to the learning-based model and the availability of a training dataset comprising cluttered radar images captured in diverse scenes and the corresponding clean images obtained by background subtraction. In addition, with the assistance of the adaptive SVD, the strongest clutter in HSR images can be removed in preprocessing, thereby reducing redundant information and improving the extraction of salient information. Furthermore, sparse representation is incorporated to further optimize the model, and the alternating direction method of multipliers (ADMM) is used to solve the optimization problem for precision and efficiency. Reconstructing the image of an irregularly shaped target is still a challenge because the analytic model is not considered in the proposed method, as shown in Figure 12. Nevertheless, experiments carried out on several datasets demonstrate that, in terms of SCR, the proposed method achieves superior performance in mitigating the strong clutter in HSR images with the target preserved compared with the existing clutter mitigation methods.

Conclusions
In this paper, a learning-based clutter mitigation method is proposed for holographic subsurface imaging. First, an autoencoder-based clutter mitigation scheme is presented. Then, taking advantage of the adaptive SVD, the performance of the autoencoder in extracting the target response is improved when the useful signal overlaps with strong clutter. Sparse representation is also incorporated to further enhance the sparse target profile, and ADMM is used to solve the optimization problem for precision and efficiency. Both visual and quantitative results on real data have demonstrated that, compared with the existing clutter mitigation methods, the proposed method achieves superior performance in mitigating the strong clutter in HSR images with the target preserved. Future work will focus on improving the method with a more suitable neural network to reconstruct irregular target signatures.