Underlying Topography Inversion Using TomoSAR Based on Non-Local Means for an L-Band Airborne Dataset

: The underlying topography is an important part of the three-dimensional structure of forests, and is used for a variety of applications, such as hydrology and water resource management, civil engineering projects, and forest resource surveying. Due to the three-dimensional imaging ability and strong penetration, the tomographic synthetic aperture radar (TomoSAR) with a long wavelength has been shown to be a useful tool to estimate the underlying topography. At present, most of the current methods use the local means method to estimate the sample covariance matrix, in which the vertical backscattering power is estimated. However, these methods cannot easily obtain high-precision underlying topography, and often lose some detailed information. In this paper, to solve this problem, a non-local means method is introduced to estimate the optimal covariance matrix by combining weighted neighborhood pixels. To validate the feasibility and effectiveness of this proposed method, a BioSAR 2008 campaign L-band dataset acquired from the northern forests of Sweden was used to inverse the underlying topography. The results show that the accuracy of the underlying topography retrieved by the proposed method is improved by more than 30% when compared with the traditional method.


Introduction
The underlying topography, as an important parameter of forest resource surveying, not only affects the spatial distribution of forest resources, but is also closely related to the stability of forest ecosystems [1,2]. The traditional aerial survey or optical remote sensing approaches can only obtain the height information of the forest canopy, and they cannot obtain the real underlying topography of forests [3][4][5]. Tomographic synthetic aperture radar (TomoSAR) [6][7][8][9], and especially the long-wavelength SAR systems, can penetrate the forest canopy to the ground and record the backscattering information from the forest vertical structure [10], which provides us with the possibility of underlying topographic mapping [11,12]. TomoSAR is an extension of the traditional two-dimensional (2D) imaging to three-dimensional (3D) imaging by collecting several images at different heights [13]. In other words, TomoSAR can be used to obtain the scattering echo along the elevation by the computed tomography for each resolution cell [14]. Therefore, it has been used for underlying topography inversion by effectively separating different scatterers along the vertical direction in the same resolution cell [15][16][17][18].
Over forest areas, there are a lot of distributed scatterers, and their vertical backscattering power is contained in the amplitude and phase information of the covariance matrix (CM). Accordingly, the tomographic 3D imaging of forests is processed using the CM [19,20]. For example, the frequently used tomographic methods such as beamforming [6], adaptive beamforming (Capon) [8], and the multiple signal classification (MUSIC) [9] algorithm all obtain tomograms on the basis of CMs [21,22]. Therefore, the estimation accuracy of the CM directly determines the performance of tomographic focusing. In fact, it is impossible to acquire a true CM, due to the lack of observations. To address this difficulty, the true CM is generally replaced by a sample covariance matrix (SCM). At present, the SCM is usually estimated by the local means (LM) method [23][24][25], which statistically averages all the pixels in a sliding window. This method is simple and easy to implement. However, it can only obtain a correct estimation when all the pixel statistics within the window are consistent. If there are some differences, the estimation of the CM will be inaccurate, which easily leads to the mixing and superposition of different scattering mechanisms. Furthermore, the resolution of the tomographic spectrum will be coarse, resulting in a loss of detailed information and a reduction in the underlying topography estimation accuracy. For example, a slope can become flat ground after being processed by the LM method.
To overcome the above problems, the non-local means (NLM) method was applied to perform three-dimensional focusing, which has been already studied in [26,27]. NLM takes advantage of the neighborhood pixels in a sliding window, that is also the search window, around the center pixel. It also calculates the similarity between the center pixel and the neighborhood pixels to obtain the weight. Based on the weight, the optimal CM is calculated. Compared to the LM method, the NLM method can preserve the true scattering characteristics in each resolution cell after averaging. Based on the estimated CM, the reconstructed tomograms can retain the details well, especially over undulating terrain. However, reference [26] only considered radiometric similarity between two pixels but ignored the spatial similarity, which probably resulted in homogeneous area misinterpretation or excessive smoothness. Moreover, it is necessary to manually select a homogeneous area prior to filtering. The methodology in [27] can solve the above problem by adaptively considering both the radiometric similarity and the spatial similarity between two pixels. This method is based on the complex Gaussian distribution to calculate the similarity distance. However, over forest areas, the scattering process is complex. In particular, the pure volume scattering hypothesis no longer applies under the observation of SAR sensors with long wavelength such as L band and P band. Besides the volume scattering, there are ground scattering and double-bounce scattering. This means that it is not suitable to use complex Gaussian distribution to express statistical characteristics. In addition, it is still not yet fully clear how NLM performs in underlying topography estimation. In view of this, this paper proposes an improved NLM in TomoSAR to estimate the underlying topography. This method adopted the affine invariant distance to replace the distance used in [27], since it is invariant to any affine transformation of the matrices. This distance does not assume a particular statistical distribution. Thus, the proposed NLM method is suitable for the application of forests and can improve the accuracy of underlying topography estimation.
The rest of this work is formed as follows. Section 2 first presents the tomographic SAR model, and briefly describes the principles of the NLM method. Three classical tomographic estimators based on the NLM method are then described. Section 3 gives a description of the research area and the experimental dataset, and then presents and analyzes the experimental results. A further discussion about the tomograms in other channels and forest height estimation is shown in Section 4. Finally, the conclusions of this paper are drawn in Section 5.

SAR Tomography Model
TomoSAR is a 3D extension of the traditional 2D SAR imaging [28]. The technique makes multiple observations of the same target at different heights in the normal direction (i.e., cross-range c) of the traditional azimuth x-range r plane. It can achieve 3D imaging of Remote Sens. 2021, 13, 2926 3 of 18 target objects by obtaining another synthetic aperture in the height dimension (as shown in Figure 1).
where ( , , ) is the complex reflectivity along the vertical direction , is the complex unit, and = 4 / ( ) is known as the vertical number. is the perpendicular baseline between the master image and this th image, represents the wavelength, is the slant distance between the master image and the target, and is the incidence angle of the radar.
As the coherent noise in forest areas is strong, a multi-look approach is generally needed to suppress it. If we assume that the number of looks is , then the tomographic SAR model can be rewritten as: where ( ) represents the vector composed of the measurements; ( ) denotes the unknown vector with complex reflectivity elements; ( ) is a 1 noise vector; and ( ) indicates a steering matrix with ( ) = ( ), ( ), ⋯ , ( ) , where ( = 1, ⋯ , ) represents the discrete height position. Through multiple SAR observations at different times with slightly different incidence angles over the same place, N SAR images can be acquired. Some preprocessing steps which contain co-registration, deramping, and phase calibration with respect to a common master image must be done [29]. For an arbitrary pixel (x, r), the focused complex value u n (x, r) for the nth acquisition is expressed as [6,30]: where β(x, r, z) is the complex reflectivity along the vertical direction z, j is the complex unit, and k z = 4πb ⊥n /λrsin(θ) is known as the vertical number. b ⊥n is the perpendicular baseline between the master image and this nth image, λ represents the wavelength, r is the slant distance between the master image and the target, and θ is the incidence angle of the radar. As the coherent noise in forest areas is strong, a multi-look approach is generally needed to suppress it. If we assume that the number of looks is L, then the tomographic SAR model can be rewritten as: where g(l) represents the vector composed of the N measurements; β(l) denotes the unknown vector with D complex reflectivity elements; e(l) is a N × 1 noise vector; and A(z) indicates a N × D steering matrix with A(z) = [a(z 1 ), a(z 2 ), · · · , a(z D )], where z d (d = 1, · · · , D) represents the discrete height position. For the distributed scatterers, SAR tomography is interested in the reflectivity power along the vertical direction, which can be given by: where E(·) denotes the statistical expectation operator.
Equation (3) can be actually regarded as a problem of spectral estimation. To date, a lot of spectral estimation approaches have been proposed to implement tomographic focusing, which can be categorized into three groups: (1) non-parametric spectral estimation [6,13,14]; (2) parametric spectral estimation [9,15]; and (3) compressive sensing [16,19,31]. Among these methods, the most widely used classical spectral estimation methods are Beamforming, Capon, and MUSIC.

Beamforming
Beamforming was the earliest algorithm used to solve the overlay problem [32,33]. The basic idea is that this "finite impulse response filter" allows signals of a specific spatial frequency to pass without distortion, and attenuates the signals of other frequencies. The power of the reflectivity estimated by beamforming is written as [6,34]: where the SCM R is defined by: where (·) H denotes the Hermitian operator.

Adaptive Beamforming (Capon)
Some of the shortcomings of beamforming have been solved by the Capon approach, including the low height resolution, irregular sampling, and serious sidelobes. Its basic principle is similar to that of beamforming, but it does not assume that the received signal is the spatial white noise distribution whose variance is the unit matrix I, which is replaced by R. The reflectivity power estimated by Capon can be expressed as [8,35]:

Multiple Signal Classification (MUSIC)
The MUSIC algorithm decomposes R into signal and noise by eigen decomposition. The reflectivity power can be given by [9,33]: where U represents the eigenvector of the signal.

The NLM Algorithm
According to the analysis in the previous section, beamforming, Capon, and MUSIC all estimate the backscattering power spectrum from the CM. In order to obtain a CM that is closest to the real values, we utilize the NLM method to optimize the estimation. NLM first uses a search window (e.g., 11 × 11) with the target pixel as the center, and then uses a matching window (e.g., 3 × 3) to calculate the SCM of the center pixel and its neighborhood pixels. After this, the similarity weight, which is composed of the spatial and radiometric similarity between each neighborhood pixel and the center pixel, is calculated by traversing the entire search window. Finally, the weighted SCM of all the neighborhood pixels is used to describe the CM of the center pixel. The NLM method not only makes maximum use of the neighborhood information to ensure the authenticity of the estimation for the center pixel, but it also eliminates the interference information, such as noise or heterogeneous Remote Sens. 2021, 13, 2926 5 of 18 information in the neighborhood, by means of the weighting. This improves the estimation accuracy of the center pixel.
The value of the center pixel in the estimation window W can be estimated using the NLM method, which is given by [36]: where x 0 indicates the target pixel (the central pixel in the estimation window), index x i denotes the neighborhood pixel, and C x i represents the observation value of the neighborhood pixel. The weights are expressed as: where the spatial similarity f s (x 0 , x i ) is defined by [27]: and radiometric similarity f r (x 0 , x i ) is defined by: where P is a 2D vector about positions of pixels with reference to the matching window center, and P 2 is the total pixel number in the matching window [27]. Both the spatial and radiometric similarity kernels can be estimated from [27]: where γ is the user-defined scaling factor; γ s is the spatial extent of the filter, and can refer to the window size used in other filters such as the Lee filter [37]; and γ r controls the amount of filtering based on the radiometric similarity between these two pixels. They are usually designated as γ s = 3 and γ s = 0.9 [36]. The function d(C x 0 , C x i ) is the distance between two Hermitian matrices. This distance is defined as the affine invariant (AI) distance, because it keeps the same under any affine transformation of the matrices, and can be expressed as [27,38]: where log is a logarithm function and · F is the Frobenius norm. Using this method for the CM is similar to "filtering", i.e., assigning different weights to filter out the pixels with low similarity around the target pixel and selecting the pixels with high similarity to jointly estimate the optimal CM value of the target pixel.

The Traditional Spectral Estimation Methods Based on NLM
The three classical spectral estimation methods (beamforming, Capon, and MUSIC) are all calculated by using the SCM obtained by the LM method. In the proposed approach, NLM is applied to the CM estimation of the tomographic SAR model to improve its estimation accuracy and the inversion accuracy of the underlying topography. The steps are as follows: (1) Solve the SCM R of all the pixels in the area of interest.
(2) Specify the size of the search window W and matching window P. (3) Calculate the spatial similarity f s (x 0 , x i ) and radiometric similarity f r (x 0 , x i ) of the matching window between the central pixel x 0 and neighboring pixel x i . (A pixel (m, n) in the research region is located at x 0 within its search window W). (4) Calculate the weight of the neighborhood pixel x i based on the spatial and radiometric similarity. (5) Calculate the optimal weighted CM of the center pixel by the using of the SCM of all the neighborhood pixels (except for the center pixel) in the search window and their corresponding weights. (6) Substitute the estimated CM into the spectrum estimation formula to estimate the pixel's spectrum. (7) Traverse the whole study area, and repeat steps (3) to (6) to obtain the spectra over the whole area.
In addition, the details of the classical spectral estimators based on NLM are provided in Table 1. Table 1. Details of the classical spectral estimators based on NLM.

Until (finish)
The principle of optimal CM estimation based on NLM is shown in Figure 2. The central pixel x 0 is the weighted average of all the neighborhood pixels x i in the search window W. Neighbors with a high similarity are given larger weights w x 0 ,x 1 i and w x 0 ,x 2 i , while the neighborhood pixels with large differences are given smaller weights w x 0 ,x 3 i . Remote Sens. 2020, 17, x FOR PEER REVIEW 6 of 18 (4) Calculate the weight of the neighborhood pixel based on the spatial and radiometric similarity. (5) Calculate the optimal weighted CM of the center pixel by the using of the SCM of all the neighborhood pixels (except for the center pixel) in the search window and their corresponding weights. (6) Substitute the estimated CM into the spectrum estimation formula to estimate the pixel's spectrum. (7) Traverse the whole study area, and repeat steps (3) to (6) to obtain the spectra over the whole area.
In addition, the details of the classical spectral estimators based on NLM are provided in Table 1.
The principle of optimal CM estimation based on NLM is shown in Figure 2. The central pixel is the weighted average of all the neighborhood pixels in the search window . Neighbors with a high similarity are given larger weights , and , , while the neighborhood pixels with large differences are given smaller weights , .

Experiments and Results
For the purpose of investigating the feasibility and validity of the proposed method, TomoSAR was applied to a real airborne SAR dataset to obtain the underlying topography, and a comparison was made with LiDAR measurements.

Experiments and Results
For the purpose of investigating the feasibility and validity of the proposed method, TomoSAR was applied to a real airborne SAR dataset to obtain the underlying topography, and a comparison was made with LiDAR measurements.

Study Area and Dataset
This paper selected a boreal forest in Krycklan, located in the north of Sweden, as the study area, which is made up of coniferous trees such as Scots pine and Norway spruce, and a few birch trees. The average annual temperature is about 1 • C and the average annual precipitation is around 600 mm. Moreover, the terrain topography is hilly, varying from 190 m to 290 m. The average tree height is about 18 m and the maximum tree height is around 30 m [39,40].
In order to address some important specific requirements of the European Space Agency (ESA) BIOMASS Earth resources exploration program, a stack of six L-band SAR images in fully polarimetric mode was obtained over the study area by the German Aerospace Centre (DLR) during the ESA BioSAR 2008 campaign on 15 October 2008. The BioSAR 2008 project was carried out as a cooperation between the ESA, the DLR, the Swedish Defense Research Agency (FOI), the Swedish University of Agricultural Sciences (SLU), the Biosphere Remote Sensing Research and Education Centre (CESBIO), and the Polytechnic of Milan in Italy. They undertook the preprocessing steps for this dataset, including co-registration, flat-earth phase removal and phase error calibration [40]. Six uniform perpendicular baselines were distributed along the vertical direction. The interval between two adjacent baselines was about 6 m. The incidence angle varied from 25 • at the near range to 55 • at the far range [40]. The vertical resolution varied in the range of 6 m to 25 m. Parameters of the E-SAR airborne system and the information of perpendicular baselines are presented in Tables 2 and 3, respectively [40].
Moreover, in order to validate the results of TomoSAR, a laser radar S/N425 TopEye system onboard a helicopter platform was used to generate a point cloud dataset over this study area on 5 August and 6 August 2008. From these point cloud data, the digital terrain model (DTM) was obtained.

Comparison of the Tomograms
A profile along range (the red dotted line, as seen in Figure 3) was selected as an example for tomogram analyzing. This profile was fixed at the 300th azimuth resolution cell.
According to Tebaldini et al. [11], the backscattering reflectivity of the HH polarization channel mainly comes from the ground, and the backscattering power of the HV polarization channel is mainly concentrated on the canopy volume scattering. Figure 4 shows the spectrum estimation results in the HH polarization channel obtained by the Remote Sens. 2021, 13, 2926 8 of 18 classical spectral estimation methods (beamforming, Capon, and MUSIC) based on the LM and NLM methods, respectively. It was found that the tomograms estimated by the three methods based on the LM method failed to separate the ground and canopy backscattering contributions in many places (e.g., the three red circles in Figure 4a,c,e) for which the ground contributions were weak, mistreating the scattering phase center of the canopy as the ground scattering phase center. However, the tomograms based on the NLM method could basically distinguish the scattering phase centers of the ground and the canopy, such as those marked by the three yellow circles in Figure 4b,d,f. In addition, the tomograms estimated by the NLM method were clearer and more continuous than those estimated by the LM method, which was mainly due to the more real and accurate CM estimated by the NLM method. As a result of the NLM method considering more non-local neighborhood information, the target pixel reflects its own optimal information by looking for pixels with higher spatial and radiometric similarity. This avoids the erroneous information which can result from a direct and sloppy adoption of the LM method, such as noise or pixels that differ greatly from their characteristics. According to Tebaldini et al. [11], the backscattering reflectivity of the HH polarization channel mainly comes from the ground, and the backscattering power of the HV polarization channel is mainly concentrated on the canopy volume scattering. Figure 4 shows the spectrum estimation results in the HH polarization channel obtained by the classical spectral estimation methods (beamforming, Capon, and MUSIC) based on the LM and NLM methods, respectively. It was found that the tomograms estimated by the three methods based on the LM method failed to separate the ground and canopy backscattering contributions in many places (e.g., the three red circles in Figure 4a,c,e) for which the ground contributions were weak, mistreating the scattering phase center of the canopy as the ground scattering phase center. However, the tomograms based on the NLM method could basically distinguish the scattering phase centers of the ground and the canopy, such as those marked by the three yellow circles in Figure 4b,d,f. In addition, the tomograms estimated by the NLM method were clearer and more continuous than those estimated by the LM method, which was mainly due to the more real and accurate CM estimated by the NLM method. As a result of the NLM method considering more non-local neighborhood information, the target pixel reflects its own optimal information by looking for pixels with higher spatial and radiometric similarity. This avoids the erroneous information which can result from a direct and sloppy adoption of the LM method, such as noise or pixels that differ greatly from their characteristics. Figure 5 shows the underlying topography obtained from the tomogram estimation results. Obviously, the spectral estimation method based on the LM method failed to estimate the phase center of the ground in many parts, and mistook the height of the canopy as the height of the ground, especially in the methods of beamforming and Capon. Two important parameters in the NLM method were selected as = 3 and = 15, and the optimal value of was selected through a large number of experiments (see Figure  6). The selection of the above two parameters depend on the study area. In general, several profiles or plots are selected to optimize the parameters before the application for whole areas. According to the experience in this paper, the range of is 3-5, and the range of is 10-25.  Figure 5 shows the underlying topography obtained from the tomogram estimation results. Obviously, the spectral estimation method based on the LM method failed to estimate the phase center of the ground in many parts, and mistook the height of the canopy as the height of the ground, especially in the methods of beamforming and Capon. Two important parameters in the NLM method were selected as p = 3 and w = 15, and the optimal value of w was selected through a large number of experiments (see Figure 6). The selection of the above two parameters depend on the study area. In general, several profiles or plots are selected to optimize the parameters before the application for whole areas. According to the experience in this paper, the range of p is 3-5, and the range of w is 10-25.
In order to quantitatively investigate the effectiveness of the proposed method, the estimation accuracies of different methods were calculated by the root mean square error (RMSE) values. According to Table 4, after the application of the NLM method, the accuracy of the three classical spectral estimation methods of beamforming, Capon, and MUSIC was significantly improved by 34.87%, 38.28%, and 31.61%, respectively.    Furthermore, we analyzed the computational impact on the used NLM method quantitatively, as shown in Table 5. For the selected range profile, the running time of the NLM methods increased by approximately 14 times that of the LM methods for the same computer (a desktop PC with an i7-8700 6-core processor).    In order to quantitatively investigate the effectiveness of the proposed method, the estimation accuracies of different methods were calculated by the root mean square error (RMSE) values. According to Table 4, after the application of the NLM method, the accuracy of the three classical spectral estimation methods of beamforming, Capon, and MUSIC was significantly improved by 34.87%, 38.28%, and 31.61%, respectively. Furthermore, we analyzed the computational impact on the used NLM method quantitatively, as shown in Table 5. For the selected range profile, the running time of the NLM methods increased by approximately 14 times that of the LM methods for the same computer (a desktop PC with an i7-8700 6-core processor). For verifying the universality of the proposed method, we applied the method to the whole study area for the inversion of the underlying topography, and compared the results with those of the LM method.
From Figure 7, it can be observed that the underlying topography estimated based

Inversion of the Underlying Topography
For verifying the universality of the proposed method, we applied the method to the whole study area for the inversion of the underlying topography, and compared the results with those of the LM method.
From Figure 7, it can be observed that the underlying topography estimated based on the LM and NLM methods is close to the LiDAR DTM measurements. However, there are many areas where the underlying terrain has been overestimated (e.g., the six black circles in Figure 7) in the results based on the LM method. However, the results based on the NLM method (e.g., the six white circles in Figure 7) are less likely to show this. The main reason for this is that the LM method fails to distinguish the scattering phase center between the ground and the canopy in some areas, which leads to the canopy height being mistaken as the ground height. This is the same as the tomogram analysis. In addition, the NLM method also shows a better inversion performance in places with drastic terrain height changes, and it preserves more detailed information (e.g., the three white rectangles in Figure 7). Therefore, the NLM method is applicable for use in areas with large terrain fluctuation, and can achieve a high inversion accuracy. Table 6 lists the RMSEs of the two methods for the inversion of the underlying topography, which proves quantitatively that the NLM method improves the accuracy of the inversion of the underlying terrain by more than 30% when compared with the LM method. derlying terrain by more than 30% when compared with the LM method.   Figure 8 shows the relationships between the LiDAR measurements and the inversion results obtained by different methods. Moreover, we calculated the correlation coefficients between the LiDAR measurements and the estimation results (see in Figure 8). The results show that the estimation results of the NLM method (as shown in Figure 8df) were more relevant to LiDAR measurements than those of the LM method (as shown in Figure 8a-c). It suggests that the inversion results of the NLM method are more reliable.   Figure 8 shows the relationships between the LiDAR measurements and the inversion results obtained by different methods. Moreover, we calculated the correlation coefficients between the LiDAR measurements and the estimation results (see in Figure 8). The results show that the estimation results of the NLM method (as shown in Figure 8d-f) were more relevant to LiDAR measurements than those of the LM method (as shown in Figure 8a-c). It suggests that the inversion results of the NLM method are more reliable.

Optimal Covariance Matrix Estimation
From Figure 9, it can be seen that the SCM calculated using the LM method is significantly different from the optimal CM obtained by the NLM method. Figure 9a represents the amplitude information of the SCM estimated by the traditional LM method. Its main diagonal elements have almost identical values, which is the result of using the local mean values. However, the amplitude information of the CM estimated by the NLM method (see Figure 9b) clearly distinguishes the proportions of the different information. Furthermore, there are also some differences between the two methods with regard to phase information (see Figure 9c,d). Thus, the scattering phase centers of the ground and the canopy can be detected separately. This is also the main reason why the accuracy of the NLM method is improved compared with that of the traditional LM method. Therefore, the accurate estimation of CM information plays an important role in underlying topography inversion. The estimation accuracy of the NLM method is better than that of the LM method, but its computation time of neighborhood search is higher, which may limit the application of this method for the for the large scale or global scale monitoring. Moreover, there may exist the potentially distortive effect of NLM approach on radiometric information, which may be caused by the topography. In the future, our work will focus on improving the computational efficiency of this method and removing the topography-induced radiometric distortion as much as possible.

Tomograms in the HV and VV Channels
Tomograms of the aforementioned selected profile in the HV channel and VV channel were also estimated. The estimated spectra obtained by the traditional methods can basically only detect the backscattering phase center of the canopy in the high tree region (approximately the 200th to 600th pixels), as shown in Figure 10. Furthermore, this is mainly embodied in the beamforming and Capon methods. In contrast, although the method presented in this paper shows a certain improvement effect in the high tree area, there are still some areas where the backscattering phase center of the ground is not detected, especially for beamforming. This is mainly because the backscattering reflectivity of

Optimal Covariance Matrix Estimation
From Figure 9, it can be seen that the SCM calculated using the LM method is significantly different from the optimal CM obtained by the NLM method. Figure 9a represents the amplitude information of the SCM estimated by the traditional LM method. Its main diagonal elements have almost identical values, which is the result of using the local mean values. However, the amplitude information of the CM estimated by the NLM method (see Figure 9b) clearly distinguishes the proportions of the different information. Furthermore, there are also some differences between the two methods with regard to phase information (see Figure 9c,d). Thus, the scattering phase centers of the ground and the canopy can be detected separately. This is also the main reason why the accuracy of the NLM method is improved compared with that of the traditional LM method. Therefore, the accurate estimation of CM information plays an important role in underlying topography inversion. The estimation accuracy of the NLM method is better than that of the LM method, but its computation time of neighborhood search is higher, which may limit the application of this method for the for the large scale or global scale monitoring. Moreover, there may exist the potentially distortive effect of NLM approach on radiometric information, which may be caused by the topography. In the future, our work will focus on improving the computational efficiency of this method and removing the topography-induced radiometric distortion as much as possible.

Tomograms in the HV and VV Channels
Tomograms of the aforementioned selected profile in the HV channel and VV channel were also estimated. The estimated spectra obtained by the traditional methods can basically only detect the backscattering phase center of the canopy in the high tree region (approximately the 200th to 600th pixels), as shown in Figure 10. Furthermore, this is mainly embodied in the beamforming and Capon methods. In contrast, although the method presented in this paper shows a certain improvement effect in the high tree area, there are still some areas where the backscattering phase center of the ground is not detected, especially for beamforming. This is mainly because the backscattering reflectivity of the HV polarization channel is mainly concentrated in the canopy. The inversion performances of the proposed method and the traditional method in the VV polarization channel are shown in Figure 11. It can be found that the backscattering power level of the VV polarization channel is between that of the HH polarization channel and that of the HV polarization channel, which is the same as the conclusion made by Tebaldini et al. [11]. In other words, the backscattering power from the canopy in the VV polarization channel is stronger than that from the HH polarization channel, and the backscattering power from the ground is stronger than that from the HV polarization channel.
Remote Sens. 2020, 17, x FOR PEER REVIEW 13 of 18 the HV polarization channel is mainly concentrated in the canopy. The inversion performances of the proposed method and the traditional method in the VV polarization channel are shown in Figure 11. It can be found that the backscattering power level of the VV polarization channel is between that of the HH polarization channel and that of the HV polarization channel, which is the same as the conclusion made by Tebaldini et al. [11]. In other words, the backscattering power from the canopy in the VV polarization channel is stronger than that from the HH polarization channel, and the backscattering power from the ground is stronger than that from the HV polarization channel.   the HV polarization channel is mainly concentrated in the canopy. The inversion performances of the proposed method and the traditional method in the VV polarization channel are shown in Figure 11. It can be found that the backscattering power level of the VV polarization channel is between that of the HH polarization channel and that of the HV polarization channel, which is the same as the conclusion made by Tebaldini et al. [11]. In other words, the backscattering power from the canopy in the VV polarization channel is stronger than that from the HH polarization channel, and the backscattering power from the ground is stronger than that from the HV polarization channel.

Forest Height Estimation
Besides the underlying topography, forest height is another significant parameter of forest resource surveying; furthermore, it is very important to estimate the forest biomass. In addition, the backscattering power of the canopy is mainly concentrated in the HV polarization channel, as we mentioned above. In order to clearly show the forest height estimation at the selected profile, the ground scattering phase was removed, as shown in Figure 12. In some areas (e.g., the three dotted circles), the traditional methods may cause canopy underestimation by mistaking the backscattering phase center of the ground for that of the canopy. However, the canopy height can be accurately estimated at the three corresponding solid circles in the results of the proposed method. In addition, the tomograms obtained by beamforming and Capon are similar, but the resolution of the latter is higher. The tomogram obtained by MUSIC is not as complete as those of the previous two methods. In the future, we will try to solve this problem by using the data of multiple polarization channels at the same time, and we will investigate in detail the performance of the proposed method in forest height inversion.

Forest Height Estimation
Besides the underlying topography, forest height is another significant parameter of forest resource surveying; furthermore, it is very important to estimate the forest biomass. In addition, the backscattering power of the canopy is mainly concentrated in the HV polarization channel, as we mentioned above. In order to clearly show the forest height estimation at the selected profile, the ground scattering phase was removed, as shown in Figure 12. In some areas (e.g., the three dotted circles), the traditional methods may cause canopy underestimation by mistaking the backscattering phase center of the ground for that of the canopy. However, the canopy height can be accurately estimated at the three corresponding solid circles in the results of the proposed method. In addition, the tomograms obtained by beamforming and Capon are similar, but the resolution of the latter is higher. The tomogram obtained by MUSIC is not as complete as those of the previous two methods. In the future, we will try to solve this problem by using the data of multiple polarization channels at the same time, and we will investigate in detail the performance of the proposed method in forest height inversion.

Conclusions
In this paper, we utilized the idea of NLM filtering into the estimation of the CM in tomographic estimation, to solve the problems caused by the inaccurate estimation of the CM, such as the mixing and superposition of different scattering mechanisms, the inaccurate tomographic spectrum, the loss of detail information, and the low estimation accuracy of the terrain information. The three classical spectral estimation methods (i.e., beamforming, Capon, and MUSIC) based on the NLM method are mainly used to perform tomographic focusing. Specifically, these three methods first calculate the SCM of all the pixels in the target region. Secondly, for each target pixel, in its search window, the similarity, including the spatial and radiometric similarity between each neighborhood pixel and the target pixel, is calculated through a matching window. The similarity weight between each neighborhood pixel and the center pixel is then calculated using the spatial and radiometric similarity. Finally, the optimal CM of the target pixel is estimated from the SCM of each neighborhood pixel and its corresponding weights. For the purpose of verifying the feasibility and effectiveness of this method, we used BioSAR 2008

Conclusions
In this paper, we utilized the idea of NLM filtering into the estimation of the CM in tomographic estimation, to solve the problems caused by the inaccurate estimation of the CM, such as the mixing and superposition of different scattering mechanisms, the inaccurate tomographic spectrum, the loss of detail information, and the low estimation accuracy of the terrain information. The three classical spectral estimation methods (i.e., beamforming, Capon, and MUSIC) based on the NLM method are mainly used to perform tomographic focusing. Specifically, these three methods first calculate the SCM of all the pixels in the target region. Secondly, for each target pixel, in its search window, the similarity, including the spatial and radiometric similarity between each neighborhood pixel and the target pixel, is calculated through a matching window. The similarity weight between each neighborhood pixel and the center pixel is then calculated using the spatial and radiometric similarity. Finally, the optimal CM of the target pixel is estimated from the SCM of each neighborhood pixel and its corresponding weights. For the purpose of verifying the feasibility and effectiveness of this method, we used BioSAR 2008 data obtained from the boreal forest of northern Sweden for the experimental verification. Tomogram analysis and underlying topography inversion were carried out. Moreover, in this paper, we have made a comparison between the consequences of the traditional LM method and the results of the NLM method. The consequences show that the NLM method performs significantly better than the LM method. The NLM method can not only better separate the scattering phase centers between the ground and the canopy, but it also shows a better inversion performance in the places where the topographic relief changes greatly. Moreover, the inversion accuracy of the NLM method is improved by more than 30% when compared with that of the LM method.
In conclusion, the NLM method considers giving different weights to the non-local neighborhood pixels with different similarity to the target pixel, to suppress the interference information and achieve more accurate estimation of the target pixel value. However, this makes the calculation slower and the estimation efficiency is reduced, which is something we need to investigate and improve upon in the future.