Performance Evaluation of Group Sparse Reconstruction and Total Variation Minimization for Target Imaging in Stratiﬁed Subsurface Media

: Sparse reconstruction methods have been successfully applied for e ﬃ cient radar imaging of targets embedded in stratiﬁed dielectric subsurface media. Recently, a total variation minimization (TVM) based approach was shown to provide superior image reconstruction performance over standard L1-norm minimization-based method, especially in case of non-point-like targets. Alternatively, group sparse reconstruction (GSR) schemes can also be employed to account for embedded target extent. In this paper, we provide qualitative and quantitative performance evaluations of TVM and GSR schemes for e ﬃ cient and reliable target imaging in stratiﬁed subsurface media. Using numerical electromagnetic data of targets buried in the ground, we demonstrate that GSR and TVM provide comparable reconstruction performance qualitatively, with GSR exhibiting a slight superiority over TVM quantitatively, albeit at the expense of less ﬂexibility in regularization parameters.

Recently, a generalized sparse image reconstruction approach with total variation minimization (TVM) was proposed for efficient and reliable radar imaging through multilayered background media [4]. More specifically, the multilayered subsurface Green's function was incorporated in the imaging algorithm to model the wave propagation effects in the multilayered environment and was efficiently evaluated using the saddle point method. As compared to standard l 1 -norm minimization-based techniques [20,22], which are based on point target model, the TVM-based approach minimizes the gradient of the image, thereby leading to better edge preservation and, in turn, reconstruction of non-point-like and extended targets [4,21].
An alternative to TVM based approach is group sparse reconstruction (GSR), which can also account for the target extent [21,23]. In high-resolution images, each extended or non-point-like target generally

Sparsity-Based Image Formation through Stratified Subsurface Media
In this section, we first describe the signal model for imaging through stratified subsurface media with a co-located MIMO radar. Then, we briefly review the TVM and GSR techniques for image reconstruction.

Signal Model in Matrix Form
We consider an N-element transmit array and an M-element receive array, with r tn = (x tn , z tn ) and r rm = (x rm , z rm ) denoting the respective position vectors of the nth transmitter and mth receiver. A stepped-frequency signal, with P frequencies uniformly covering the frequency band [ f min , f max ], is used for imaging. The transmitters are assumed to be activated sequentially, while simultaneous reception at all receivers is assumed. We focus on the four-layered background media, shown in Figure 1, where the first layer is air and the remaining three are subsurface layers. The dielectric constant and conductivity of the subsurface layers are assumed to be (ε r2 , σ 2 ), (ε r3 , σ 3 ), and (ε r4 , σ 4 ), while the second and third layers have a thickness of d 2 and d 3 , respectively. Although the presented formulation considers only three subsurface layers, it can be readily generalized to an arbitrary number of subsurface layers. generally occupies a contiguous group of pixels rather than a single pixel. As such, the point-targetbased sparse signal model can be refined to exclude from the solution space any image whose support contains isolated pixel indices [22]. The group sparsity approach incorporates the clustering of the non-zero image pixels into a small number of contiguous groups as a constraint in the sparse reconstruction problem. The specific clustering pattern can be structured to match the desired target shape.
In this paper, we provide a performance evaluation of TVM and GSR schemes for non-point-like target imaging in stratified subsurface media. To this end, we consider a multiple-input multiple output (MIMO) radar system and use numerical electromagnetic data of targets buried in a fourlayered background environment. We show that the two methods provide comparable performance qualitatively under noisy measurements. Quantitatively, GSR exhibits a slight superiority in terms of image reconstruction, especially at low signal-to-noise ratio (SNR) values. However, this GSR quantitative performance is achieved at the expense of less flexibility in setting of regularization parameters.
The remainder of the paper is organized as follows. Section 2 provides a review of the Green's function formulation for modeling the wave propagation effects in the multilayered background media and details the TVM and GSR based imaging algorithms. Section 3 describes the considered metrics for quantitative assessment and provides performance comparison using two sets of image reconstruction results of targets embedded in stratified subsurface media. Concluding remarks are provided in Section 4.

Sparsity-Based Image Formation through Stratified Subsurface Media
In this section, we first describe the signal model for imaging through stratified subsurface media with a co-located MIMO radar. Then, we briefly review the TVM and GSR techniques for image reconstruction.

Signal Model in Matrix Form
We consider an N-element transmit array and an M-element receive array, with , and , denoting the respective position vectors of the th transmitter and th receiver. A stepped-frequency signal, with frequencies uniformly covering the frequency band , , is used for imaging. The transmitters are assumed to be activated sequentially, while simultaneous reception at all receivers is assumed. We focus on the four-layered background media, shown in Figure 1, where the first layer is air and the remaining three are subsurface layers. The  dielectric constant and conductivity of the subsurface layers are assumed to be  ,  ,  , , and , , while the second and third layers have a thickness of and , respectively. Although the presented formulation considers only three subsurface layers, it can be readily generalized to an arbitrary number of subsurface layers. Assuming a point target embedded in the fourth layer, the received scattered field, , , , at the mth receiver with the nth transmitter active can be expressed as Figure 1. Radar imaging through stratified subsurface media [21]. Assuming a point target embedded in the fourth layer, the received scattered field, E s (r rm , r tn , k p ), at the mth receiver with the nth transmitter active can be expressed as E s (r rm , r tn , k p ) = G(r rm , r, k p )G(r, r tn , k p )σ(r)dr, (1) where σ(r) is the scene reflectivity at position r = (x, z), k p is the free-space wave number of the pth frequency, and G(r rm , r, k p ) and G(r, r tn , k p ) are the layered media Green's functions characterizing wave propagation from the transmitter to the target and from the target to the receiver, respectively. We note that Equation (1) essentially uses the first-order Born approximation, which ignores the multiple scattering effects. We discretize the region being imaged in the xz-plane into K × L pixels, and represent the corresponding scene reflectivity by the K × L matrix s. Then, the mth received signal in Equation (1) can be expressed in matrix form as where vec(·) returns the column-wise vectorization of its matrix argument, the pth element of y m,n is [ y m,n ] p = E s (r rm , r tn , k p ), and Ψ m,n is a P × KL dictionary matrix encompassing the stratified media effects, with its (p, q)th element given by The layered media Green's function for the GPR imaging configuration in Figure 1 can be expressed in closed-form using the Saddle Point Method (SPM) as [4] G(r R , r, k p ) ≈ jk 2 where r R = (x R , z R ), k 1 is the wavenumber of the air layer, α is a real-valued scaling variable, and with where y = [ y T 1,1 , y T 1,2 , . . . The signal model in Equation (8) corresponds to the full data measurements, comprising all P frequencies from all N transmitters and M receivers. In many practical operational scenarios, there are often cost constraints, which may limit the number of transmitters and receivers available for deployment. Note that reducing the number of frequencies at which measurements are made over the desired bandwidth may not translate into cost reduction. This is because the antennas and radio frequency (RF) front end would still be required to operate over the entire frequency band. As such, we retain the use of all P frequencies and assume that N t < N transmitters and M r < M receivers are available for data collection. Under these constraints, the model in Equation (8) takes the form y = ΛΨs = Θs. (9) Here, Λ is an M r N t P × NMP measurement matrix given by [24] where ' ' denotes the Kronecker product, I (·) is an identity matrix with the subscript indicating its dimensions, Φ is an M r × M matrix constructed by randomly selecting M r rows of I N t P , and ϑ is an N t × N matrix consisting of N t randomly selected rows of I MP .

Total Variation Minimization
Using the reduced measurements in Equation (9), the unknown scene reflectivity vector s can be recovered by solving the TVM problem [4], where δ represents a small tolerance error, In this work, we use the Nesterov algorithm in the NESTA package to solve the TVM problem in Equation (11) [25]. This algorithm utilizes a regularization scheme together with a smoothed version of l 1 -norm to achieve the solution of the underlying convex optimization problem.

Group Sparse Reconstruction
In high-resolution imaging, targets generally occupy a group of neighboring pixels whose extent depends not only on the target dimension, but also on the system resolution. This prior pixel neighborhood information can be incorporated in the image reconstruction problem using group sparsity constraints. More specifically, the scene reflectivity vector s can be obtained by solving the convex optimization problem [23,26,27] where g q ⊆ {0, 1, . . . , KL − 1} is an index set corresponding to the group of pixels forming a neighborhood around the qth pixel and the diagonal weighting matrix W (q) ensures that the weighting within a group is according to the desired pixel neighborhood relation. Figure 2 provides an example of how the grouping of the image pixels works for a 10 × 10 image [23]. The number in the top left corner of each square indicates the pixel index, whereas the pixel weight of the depicted group is represented by the number in the center of each pixel. The weights are chosen such that their sum equals unity to avoid unintentional scaling of the reconstruction result. As shown in Figure 2, the index set for the group corresponding to the 12th pixel is g 12 = {2, 11, 12, 13, 22}, with the corresponding weighting matrix W (12) = diag 1 8 , 1 8 , 1 2 , 1 8 , 1 8 . In this paper, we solve the reconstruction problem in Equation (13) using Primal YALL1 group [27] and utilize overlapping groups. neighborhood around the th pixel and the diagonal weighting matrix ensures that the weighting within a group is according to the desired pixel neighborhood relation. Figure 2 provides an example of how the grouping of the image pixels works for a 10 10 image [23]. The number in the top left corner of each square indicates the pixel index, whereas the pixel weight of the depicted group is represented by the number in the center of each pixel. The weights are chosen such that their sum equals unity to avoid unintentional scaling of the reconstruction result. As shown in Figure 2, the index set for the group corresponding to the 12th pixel is 2, 11, 12, 13, 22 , with the corresponding weighting matrix diag , , , , . In this paper, we solve the reconstruction problem in Equation (13) using Primal YALL1 group [27] and utilize overlapping groups.

Performance Evaluation Results and Discussion
In this section, we first describe the metrics considered for quantitative performance evaluation and then present the image reconstruction assessment results of targets embedded in a stratified subsurface.

Performance Evaluation Results and Discussion
In this section, we first describe the metrics considered for quantitative performance evaluation and then present the image reconstruction assessment results of targets embedded in a stratified subsurface.

Quantitative Metrics
We consider two different metrics, namely, relative clutter power (RCP) and Earth mover's distance (EMD), for quantitative performance evaluation of the TVM and GSR schemes.

Relative Clutter Peak
We define the target region, R t , as the union of rectangular or circular regions at known target positions, whereas the remainder of the image comprises the clutter region, R c ,. The size of each individual target region is determined based on the ground truth and the system resolution. With A R t ,ŝ and A R c ,ŝ denoting the respective maximum amplitude of target and clutter regions in the reconstructed imageŝ, the RCP is defined as [23] RCP in dB = 20 log 10 whereŝ q is the qth element ofŝ. The RCP metric penalizes strong clutter and favors clean images with low noise and clutter power and high target amplitudes.

Earth Mover's Distance
Earth mover's distance is defined as the minimal amount of image intensity that has to be moved to transform one image into another [28,29]. For the underlying application, we measure the EMD between the reconstructed image and the ground truth image. This metric incorporates perceptual differences between the reconstructed and ground truth images, as it measures error in terms of not only the differences in pixel values, but also physical distance away from the actual target locations. It, therefore, is a preferred metric over mean-squared error in sparse reconstruction literature [30]. In this work, we use a fast implementation of EMD [31].

Performance Comparison
We consider a MIMO radar system with 17 uniformly spaced transmitters from −0.96 to 0.96 m and 16 receivers equally spaced from −0.9 to 0.9 m. Both arrays are at a height of 0.2 m above the ground. The stepped-frequency signal covers the 0.8 to 2 GHz bandwidth with P = 49 frequency steps. A time-domain full wave electromagnetic solver based on Finite-Difference Time-Domain (FDTD) method is used for generating the received signals from two different scenes. Fast Fourier Transform (FFT) is applied to transform the time-domain received signals to frequency domain. The radar measurement configuration and signal parameters are the same in both scenarios. White Gaussian noise is added to the frequency-domain data. For image reconstruction, the thickness and complex permittivity of each layer of the stratified subsurface media are assumed to be known a priori. In practice, however, these parameters can be estimated using an inversion scheme. The inversion of multilayered medium parameters has been well developed within the framework of one-dimensional inverse scattering in the past two decades [32][33][34][35]. For the conventional single layer subsurface, analytical methods for estimation of the dielectric slab parameters have been provided in [32,35]. For multilayered subsurface media, the number and parameters of each layer can be efficiently retrieved using a layer-stripping algorithm or global optimization inverse scattering techniques [33,34].

Example 1
In this example, we consider three metallic targets (two rectangular and one cylindrical) embedded in a three-layered background environment, as shown in Figure 3. The dielectric constant, conductivity, and thickness of the second layer are ε r2 = 6, σ 2 = 0.01 S/m, and d 2 = 0.2 m, respectively. The third layer with a dielectric constant ε r3 = 3 and conductivity σ 3 = 0.005 S/m contains the three targets. The target dimensions are specified in Table 1. We randomly select two transmitters (12% of the available quantity) and six receivers (38% of the available number). For each chosen transmitter-receiver pair, we utilize all 49 frequency measurements to reconstruct the image. Figures 4 and 5 depict the images obtained using TVM, GSR, and standard l 1 -norm sparse reconstruction [20] for −10 and −5 dB SNR values, respectively. We observe from Figures 4 and 5 that although there are a few false reconstructions in both TVM and GSR results, the two approaches provide cleaner images of the targets as compared to the standard l 1 -norm sparse reconstruction result.           Next, we quantitatively evaluate the performance of the TVR, GSR, and standard -norm reconstruction schemes. We consider SNR values in the [−10 10] dB range with 5 dB increments. We perform a total of 100 Monte Carlo trials for each SNR value with different realization of noise and different randomly chosen sets of two transmitters and six receivers each time. For every trial, we reconstruct the image using TVM, GSR, and standard -norm schemes and compute the corresponding values of the metrics. Figure 6 plots the RCP and the EMD, each averaged over 100 trials, versus SNR. The variance of the EMD for each SNR is also indicated in Figure 5. We observe that GSR and TVM provide almost identical RCP performance, while significantly outperforming standard -norm reconstruction for all SNR values. In terms of EMD at low SNR, GSR provides the best performance as manifested by the lowest EMD average and variance values, while standardnorm reconstruction has the worst performance with TVM in the middle. At higher SNR values, the average EMD values for all three reconstruction methods are comparable. However, both GSR and TVM yield a smaller variance for the EMD as compared to the standard -norm reconstruction. Similar trends were observed for reconstruction performance with a random selection of four transmitters and four receivers and a random selection of four transmitters and eight receivers.

Example 2
In this example, we consider a large composite metallic target consisting of two semi-cylinders on top of a rectangle cylinder, which is embedded in the third layer of a three-layer background as shown in Figure 7. The target dimensions are also specified in Figure 7. The physical and electrical properties of the three background layers are the same as in Example 1. Figures 8 and 9 depict the reconstruction results obtained with two randomly chosen transmitters and six randomly chosen receivers using TVM, GSR, and standard -norm sparse reconstruction for SNR values of −5 and 0 dB, respectively. Similar to Example 1, we observe that both TVM and GSR approaches yield superior quality images as compared to the standard -norm sparse reconstruction. Next, we quantitatively evaluate the performance of the TVR, GSR, and standard l 1 -norm reconstruction schemes. We consider SNR values in the [−10 10] dB range with 5 dB increments. We perform a total of 100 Monte Carlo trials for each SNR value with different realization of noise and different randomly chosen sets of two transmitters and six receivers each time. For every trial, we reconstruct the image using TVM, GSR, and standard l 1 -norm schemes and compute the corresponding values of the metrics. Figure 6 plots the RCP and the EMD, each averaged over 100 trials, versus SNR. The variance of the EMD for each SNR is also indicated in Figure 5. We observe that GSR and TVM provide almost identical RCP performance, while significantly outperforming standard l 1 -norm reconstruction for all SNR values. In terms of EMD at low SNR, GSR provides the best performance as manifested by the lowest EMD average and variance values, while standard l 1 -norm reconstruction has the worst performance with TVM in the middle. At higher SNR values, the average EMD values for all three reconstruction methods are comparable. However, both GSR and TVM yield a smaller variance for the EMD as compared to the standard l 1 -norm reconstruction. Similar trends were observed for reconstruction performance with a random selection of four transmitters and four receivers and a random selection of four transmitters and eight receivers. Next, we quantitatively evaluate the performance of the TVR, GSR, and standard -norm reconstruction schemes. We consider SNR values in the [−10 10] dB range with 5 dB increments. We perform a total of 100 Monte Carlo trials for each SNR value with different realization of noise and different randomly chosen sets of two transmitters and six receivers each time. For every trial, we reconstruct the image using TVM, GSR, and standard -norm schemes and compute the corresponding values of the metrics. Figure 6 plots the RCP and the EMD, each averaged over 100 trials, versus SNR. The variance of the EMD for each SNR is also indicated in Figure 5. We observe that GSR and TVM provide almost identical RCP performance, while significantly outperforming standard -norm reconstruction for all SNR values. In terms of EMD at low SNR, GSR provides the best performance as manifested by the lowest EMD average and variance values, while standardnorm reconstruction has the worst performance with TVM in the middle. At higher SNR values, the average EMD values for all three reconstruction methods are comparable. However, both GSR and TVM yield a smaller variance for the EMD as compared to the standard -norm reconstruction. Similar trends were observed for reconstruction performance with a random selection of four transmitters and four receivers and a random selection of four transmitters and eight receivers.

Example 2
In this example, we consider a large composite metallic target consisting of two semi-cylinders on top of a rectangle cylinder, which is embedded in the third layer of a three-layer background as shown in Figure 7. The target dimensions are also specified in Figure 7. The physical and electrical properties of the three background layers are the same as in Example 1. Figures 8 and 9 depict the reconstruction results obtained with two randomly chosen transmitters and six randomly chosen receivers using TVM, GSR, and standard -norm sparse reconstruction for SNR values of −5 and 0

Example 2
In this example, we consider a large composite metallic target consisting of two semi-cylinders on top of a rectangle cylinder, which is embedded in the third layer of a three-layer background as shown in Figure 7. The target dimensions are also specified in Figure 7. The physical and electrical properties of the three background layers are the same as in Example 1. Figures 8 and 9 depict the reconstruction results obtained with two randomly chosen transmitters and six randomly chosen receivers using TVM, GSR, and standard l 1 -norm sparse reconstruction for SNR values of −5 and 0 dB, respectively. Similar to Example 1, we observe that both TVM and GSR approaches yield superior quality images as compared to the standard l 1 -norm sparse reconstruction. The two quantitative performance metrics, averaged over 100 Monte Carlo trials, are plotted vs. SNR in Figure 10 for TVR, GSR, and standard l 1 -norm reconstruction schemes. For EMD, the variance is also indicated in Figure 10. Again, we observe that the performance of GSR and TVM is comparable in terms of RCP, while that of standard l 1 -norm reconstruction is significantly lower. Unlike the three-target scene in Example 1 wherein TVM performance in terms of average EMD was approximately half-way between that of GSR and standard l 1 -norm reconstruction, the EMD curve for TVM in the larger composite target case closely follows the corresponding curve for GSR. TVM's capability of edge preservation better manifests itself in case of the composite target, leading to smaller performance difference with GSR as compared to the three-targets scene in Example 1.
Reconstructions with a random selection of four transmitters and four receivers, and a set of four transmitters and eight receivers, both sets chosen at random, yielded similar quantitative performance trends.  The two quantitative performance metrics, averaged over 100 Monte Carlo trials, are plotted vs. SNR in Figure 10 for TVR, GSR, and standard -norm reconstruction schemes. For EMD, the variance is also indicated in Figure 10. Again, we observe that the performance of GSR and TVM is comparable in terms of RCP, while that of standard -norm reconstruction is significantly lower. Unlike the three-target scene in Example 1 wherein TVM performance in terms of average EMD was approximately half-way between that of GSR and standard -norm reconstruction, the EMD curve for TVM in the larger composite target case closely follows the corresponding curve for GSR. TVM's capability of edge preservation better manifests itself in case of the composite target, leading to smaller performance difference with GSR as compared to the three-targets scene in Example 1.
Reconstructions with a random selection of four transmitters and four receivers, and a set of four transmitters and eight receivers, both sets chosen at random, yielded similar quantitative performance trends.

Discussion
The results provided in Section 3.2 quantify and validate the superior performance of the GSR and TVM approaches over the standard -norm reconstruction for non-point-like targets, especially at low SNR values. A comment is in order on the choice of regularization/penalty parameters for the employed sparse reconstruction methods. The Primal YALL1 group solver requires setting of a length-2 penalty parameter vector [27], whereas NESTA, employed for TVM and standard -norm reconstructions, requires the specification of smoothing and stopping parameters. Both the smoothing and stopping parameters in NESTA should be set to small values for higher accuracy or large values for faster convergence [36]. In general, choosing a small value for the smoothing parameter warrants a small value of the stopping parameter. For large smoothing parameter value, the stopping parameter can also be larger. Note that setting the smoothing parameter equal to zero results in use of the standard "non-smoothed" version of the -norm in the optimization problem. For the Primal YALL1 group solver, the elements of the length-2 penalty vector are set as inversely proportional to ‖ ‖ , where ‖•‖ denotes the infinity norm of the argument [23].
For the considered numerical experiments, we set these parameters for NESTA and YALL1 empirically for a nominal number of transmitters and receivers under noise-free conditions following the aforementioned guidelines, opting for higher accuracy over faster convergence for NESTA. For NESTA, no adjustments were made when the total number of transmitters and receivers were increased or decreased compared to the nominal case. However, the proportionality constants for the penalty parameters in YALL1 group had to be adjusted to account for the change in ‖ ‖ with

Discussion
The results provided in Section 3.2 quantify and validate the superior performance of the GSR and TVM approaches over the standard l 1 -norm reconstruction for non-point-like targets, especially at low SNR values. A comment is in order on the choice of regularization/penalty parameters for the employed sparse reconstruction methods. The Primal YALL1 group solver requires setting of a length-2 penalty parameter vector [27], whereas NESTA, employed for TVM and standard l 1 -norm reconstructions, requires the specification of smoothing and stopping parameters. Both the smoothing and stopping parameters in NESTA should be set to small values for higher accuracy or large values for faster convergence [36]. In general, choosing a small value for the smoothing parameter warrants a small value of the stopping parameter. For large smoothing parameter value, the stopping parameter can also be larger. Note that setting the smoothing parameter equal to zero results in use of the standard "non-smoothed" version of the l 1 -norm in the optimization problem. For the Primal YALL1 group solver, the elements of the length-2 penalty vector are set as inversely proportional to Θ H y ∞ , where · ∞ denotes the infinity norm of the argument [23]. For the considered numerical experiments, we set these parameters for NESTA and YALL1 empirically for a nominal number of transmitters and receivers under noise-free conditions following the aforementioned guidelines, opting for higher accuracy over faster convergence for NESTA. For NESTA, no adjustments were made when the total number of transmitters and receivers were increased or decreased compared to the nominal case. However, the proportionality constants for the penalty parameters in YALL1 group had to be adjusted to account for the change in Θ H y ∞ with any increase or decrease in the number of transmitters and receivers employed. Thus, compared to YALL1 group, NESTA was found to be more robust to changes in the amount of data employed for the sparse reconstructions.

Conclusions
In this paper, we conducted qualitative and quantitative performance evaluations of group sparse reconstruction and total variation minimization approaches for radar imaging through stratified subsurface media. The TVM approach minimizes the gradient of the image resulting in good edge preservation, while group sparse approach exploits prior pixel neighborhood information about extended targets for reliable imaging. Under reduced number of transmitters and receivers, numerical EM measurements with varying SNR levels were considered. Both TVM and GSR approaches demonstrated comparable qualitative performance for different subsurface scenarios. The quantitative evaluation revealed a slight performance advantage of GSR over TVM. However, this advantage came at the cost of reduced flexibility in setting regularization/penalty parameters for GSR vs. TVM.