High-Resolution Forest Mapping from TanDEM-X Interferometric Data Exploiting Nonlocal Filtering

In this paper, we discuss the potential and limitations of high-resolution single-pass interferometric synthetic aperture radar (InSAR) data for forest mapping. In particular, we present forest/non-forest classification mosaics of the State of Pennsylvania, USA, generated using TanDEM-X data at ground resolutions down to 6 m. The investigated data set was acquired in 2011 in bistatic stripmap single polarization (HH) mode. Among the different factors affecting the quality of InSAR data, the so-called volume correlation factor quantifies the coherence loss due to volume scattering, which typically occurs in the presence of vegetation, and is a very sensitive indicator for the discrimination of forested from non-forested areas. For this reason, it has been chosen as input observable for performing the classification. In this framework, both standard boxcar and nonlocal filtering methods have been considered for the estimation of the volume correlation factor. The resulting forest/non-forest mosaics have been validated using an accurate vegetation map of the region derived from Lidar-Optic data as external independent reference. Thanks to their outstanding performance in terms of noise reduction, together with the preservation of spatial features, nonlocal filters reach a level of agreement of about 80.5%, a systematic accuracy improvement of about 4.5 percentage points with respect to boxcar filtering at the same resolution. This approach is therefore of primary importance to achieve a reliable classification at such fine resolution. Finally, the high-resolution forest/non-forest classification product of the State of Pennsylvania presented in this paper demonstrates once again the outstanding capabilities of the TanDEM-X system for a wide spectrum of commercial services and scientific applications in the field of the biosphere.


Introduction
Forests cover about 30% of the Earth's landmasses and represent a terrestrial ecosystem of fundamental importance for all living beings. Indeed, they play a key role in controlling climate changes through the continuous absorption and storage of carbon dioxide (CO2) and its conversion into oxygen, having plants and trees acting as the lungs of our planet. Moreover, forests mitigate soil erosion, caused by either natural hazards or human activities, such as irresponsible farming. They are natural watersheds which catch rainwater, hence preventing flood events, and they represent a great source of energy, food, and livelihoods. Last but not least, forests are the natural habitat for a large variety of animal and plant species, preserving the existence of healthy ecosystems and biodiversity. However, starting in the last century, a massive loss and degradation of forests has occurred due to anthropogenic activities, putting this delicate natural balance in danger and, in many cases, leading to irreversible damage to sensitive environments, such as permanent extinction of plants and animal habitats or accelerated soil erosion [1].
In this scenario, the up-to-date assessment and monitoring of the state of forest resources is a task of crucial importance and, for this purpose, spaceborne remote sensing represents a unique solution for providing consistent, timely, and high-resolution data from the regional up to the global scale. In the last few decades, the majority of global forest classification products have been generated from optical remote sensing systems operating in the visible and near-infrared frequency range. Among these, it is worth highlighting the global forest tree cover map produced in 2013 from Landsat data at a spatial resolution of 30 m [2]. Given their daylight independence and their capability to penetrate through clouds, synthetic aperture radar sensors are now becoming operational systems for mapping forest coverage at a global scale. For example, global forest/non-forest classification maps have been provided by the L-band SAR sensor ALOS PALSAR at a posting of 25 m [3], and, more recently, by the TanDEM-X mission (X band) at a resolution of 50 m [4].
In this paper, we investigate the potential for high-resolution forest mapping with TanDEM-X InSAR data. The State of Pennsylvania, USA, has been used as a test case, given the availability of a high-resolution external reference map, derived from optical and lidar data [5]. This reference has been used for both training of the classification algorithm and validation of the resulting product. Here, we generated forest/non-forest mosaics with a resolution down to 6 m by exploiting the volume correlation factor, which is derived from the interferometric coherence. The applied classification algorithm has been previously developed for the generation of the global TanDEM-X Forest/Non-Forest Map [4] and its settings are here optimized for the high-resolution case.
The precise estimation of the interferometric coherence is therefore a crucial step for the generation of accurate forest/non-forest maps. Nonlocal filters are nowadays the state of the art for the estimation of interferometric parameters at high resolution, both in application to single-pair [6][7][8] and to multi-acquisition [9][10][11] interferometry. Their most prominent characteristic is the capacity to preserve the image spatial resolution while providing, at the same time, a strong denoising capability. For this reason, nonlocal estimation profitably supports a pixel-based classification method such as the one presented in [4] and used in the present work as well. Here, we investigate the performance improvement derived by the use of the NLSAR filter [6] with respect to classical boxcar multilooking.
The paper is organized as follows: an overview of the Pennsylvania test site, of the TanDEM-X mission, and of the available data set is presented in Section 2. Section 3 summarizes the principles of the applied boxcar and nonlocal filtering methods, focusing on their most relevant aspects. In the same section, the coherence-derived volume correlation factor and the considered classification approach based on fuzzy clustering [4] are recalled as well. The resulting forest/non-forest classification maps at 6 m × 6 m and 12 m × 12 m spatial resolutions are presented in Section 4, together with the validation performed using the available high-resolution reference Lidar-Optic data (introduced in Section 2.1). The results are discussed and interpreted in Section 5. Conclusions and outlook are finally drawn in Section 6.

The Pennsylvania Test Site
The State of Pennsylvania is situated in the northeastern and Mid-Atlantic part of the United States of America. Overall, it covers 119,283 km², with a width and length of about 455 km and 273 km, respectively. Approximately 60% of the Pennsylvania territory is covered by temperate forests, which are mainly characterized by the presence of deciduous trees, such as beech, oak, maple, and birch. Other vegetation types include shrubs, bushes, and wildflowers. Regarding the topography, Pennsylvania is diagonally bisected from southwest to northeast by the barrier ridges of the Appalachian Mountains, whereas the large Allegheny Plateau is located on the northwest side of the state.

The TanDEM-X Mission and Data Set
TanDEM-X (TerraSAR-X add-on for Digital Elevation Measurement) is the first spaceborne bistatic SAR mission comprising two formation-flying spacecraft. It is composed of the twin satellites TerraSAR-X (launched in 2007) and TanDEM-X (launched in June 2010), which, since December 2010, have been operationally acquiring interferometric SAR images in bistatic configuration, with a resolution of about 3 m. Both satellites fly in a closely controlled orbit formation with the opportunity for flexible along- and across-track baseline selection. The primary goal of the mission was to generate a global, consistent, and high-precision digital elevation model (DEM) at a final independent posting of 12 m × 12 m [12], which was finalized and delivered in September 2016 [13]. Since the beginning of the mission, more than half a million high-resolution images have been acquired and processed, with incidence angles ranging between 30° and about 50°. For the generation of the forest/non-forest classification map of the State of Pennsylvania presented in this paper, we have considered 208 stripmap bistatic scenes (in HH polarization), each one typically extending over an area of about 30 km in range by 50 km in azimuth. An overview of the acquisition coverage is given in Figure 1, where the overlap among the available scenes is clearly visible. Each area on ground was typically acquired once and only overlapping areas are characterized by two acquisitions. The data set was acquired by TanDEM-X in 2011 during the completion of the first global coverage. The interferometric baselines B_⊥ for the considered acquisitions are in the range between 100 m and 200 m. From such baselines, one can derive the height of ambiguity h_amb, which indicates the height difference corresponding to a complete 2π cycle of the interferometric phase. Hence, it quantifies the phase-to-height sensitivity of an interferogram and, for a bistatic acquisition, it is defined as:
$$h_{amb} = \frac{\lambda \, r \, \sin(\theta_i)}{B_\perp}, \tag{1}$$
λ being the radar wavelength, r the slant range, and θ_i the incidence angle. For the acquisitions considered in this paper, all heights of ambiguity h_amb lie roughly between 40 m and 50 m, as shown in Figure 2. Such values ensure a good quality of the phase unwrapping over most land cover types [14]. For each scene, the interferometric processing was carried out using the experimental TanDEM-X processor (TAXI), developed at the DLR Microwaves and Radar Institute [15]. For the estimation of the interferometric coherence, we considered both boxcar and nonlocal filters. We present forest/non-forest classification maps generated by using boxcar window filtering resulting in an independent posting of 6 m and 12 m, corresponding to approximately 4 and 16 looks, respectively, depending on the specific geometry. In addition, we exploited the NLSAR filtering approach at a posting of 6 m. For nonlocal methods, the number of samples/looks used for the coherence estimation depends on the specific characteristics and properties of the illuminated target. Compared to the boxcar one, nonlocal filtering shows outstanding performance in terms of noise reduction capabilities together with high detail preservation, at the cost of a larger computational burden, and can be used, e.g., for the generation of high-resolution DEMs. In this framework, we can mention the Nonlocal Means (NLM) filter [16], extended to InSAR data statistics in [6], which has been successfully applied for the generation of high-resolution DEMs from TanDEM-X data [17]. The most relevant aspects of nonlocal filtering methods are briefly recalled in Section 3.1.
The Lidar-Optic reference data set was generated by a joint collaboration between the University of Maryland and the University of Vermont and released in 2015. Optical and lidar data acquired between 2006 and 2008 were combined to generate a forest/non-forest classification map (binary information) for vegetation higher than 2 m, with a ground resolution of 1 m × 1 m. The methodology used for the generation of this map is described in [5], and an accuracy typically larger than 98% was obtained [18]. In order to compare such a high-resolution map with the forest/non-forest classification products provided by TanDEM-X, we scaled the original resolution down to the desired posting pos (equal to 6 m or 12 m) of the corresponding TanDEM-X classification map. To this end, we averaged all pixels of the Lidar-Optic map within a cell of pos × pos meters, hence obtaining a forest density map as reference. This map has been used as reference for both training of the classification algorithm and validation of the resulting forest/non-forest classification maps. Clearly, we considered different portions of the reference map for these purposes (highlighted in Figure 1 with the overlaid polygons), as further discussed in Section 3.3.
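The downscaling of the 1 m binary reference to a forest density map can be illustrated with a minimal sketch. The function name `forest_density`, the cropping of incomplete border cells, and the toy input are assumptions made for illustration only and are not taken from the processing chain used in this work.

```python
import numpy as np

def forest_density(binary_map: np.ndarray, factor: int) -> np.ndarray:
    """Average a binary forest mask (1 = forest) over factor x factor cells."""
    rows, cols = binary_map.shape
    # Assumption: incomplete cells at the image border are simply cropped.
    rows -= rows % factor
    cols -= cols % factor
    cropped = binary_map[:rows, :cols].astype(float)
    # Group pixels into (cell_row, factor, cell_col, factor) and average each cell.
    cells = cropped.reshape(rows // factor, factor, cols // factor, factor)
    return cells.mean(axis=(1, 3))

# Toy 1 m mask of 12 x 12 pixels: left half forest, right half non-forest.
mask_1m = np.zeros((12, 12), dtype=np.uint8)
mask_1m[:, :6] = 1
density_6m = forest_density(mask_1m, factor=6)  # one value per 6 m x 6 m cell
```

Each output value is the fraction of forested 1 m pixels inside the corresponding cell, so it ranges from 0 (no forest) to 1 (fully forested).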

The Nonlocal Filtering Method
In the framework of image denoising, the nonlocal means filter (NLM) was introduced by [16] as a novel paradigm for the preservation of the image fine structure, details, and texture. Contrary to standard smoothing filters, which typically consider image features as noise and consequently remove them, the NLM first identifies similar structures within the image and then tries to preserve them in the subsequent filtering step. Indeed, the NLM algorithm assumes that the image has a good degree of redundancy or, in other terms, that every small cut-out of the image (namely, a "patch") repeats itself several times within the image itself. This property is exploited to select candidate pixels picked out, in principle, from all over the image to perform the desired estimation. More specifically, all the patches extracted from a given neighborhood are compared with the one associated with the current pixel to be estimated.
Once the similarity measure has been computed, the actual filtering is performed as a weighted average of neighboring pixels, with weights which are proportional to the patch similarity. The similarity between two patches is defined according to the type of noise, e.g., in the case of additive white Gaussian (AWG) noise, the Euclidean distance is considered. This paradigm has been widely exploited in the field of image denoising and also extended to filtering techniques more advanced than the weighted average, as, e.g., in [19], where the nonlocal paradigm is combined with filtering in the wavelet-transform domain.
For interferometric SAR (InSAR) applications, the estimation of the complex interferogram is a fundamental step, as it precedes any further processing. A critical problem in this context is the unbiased estimation of the coherence, i.e., the normalized complex correlation coefficient between master and slave acquisitions. This problem concerns every application that relies on the measured coherence values, such as the one that we propose in the present paper. Indeed, as it normally holds for the moving average filter, a bias-variance trade-off holds for the coherence estimation as well: the smaller the window size is, the more biased toward higher values the coherence moving average estimator will be.
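The bias of the moving average coherence estimator toward higher values for small windows can be reproduced with a short simulation. This is only a sketch under stated assumptions: the function name `boxcar_coherence`, the window sizes, and the simulated data (two statistically independent circular Gaussian images, whose true coherence is zero) are illustrative choices and do not correspond to the TanDEM-X processing.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def boxcar_coherence(s1, s2, win):
    """Sample coherence magnitude over win x win moving windows (valid area only)."""
    def box_sum(x):
        # Sum of all samples inside each win x win window.
        return sliding_window_view(x, (win, win)).sum(axis=(-2, -1))
    num = box_sum(s1 * np.conj(s2))
    den = np.sqrt(box_sum(np.abs(s1) ** 2) * box_sum(np.abs(s2) ** 2))
    return np.abs(num) / den

rng = np.random.default_rng(0)
shape = (64, 64)
# Independent images: any nonzero estimated coherence is pure estimation bias.
s1 = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
s2 = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

mean_coh_small = boxcar_coherence(s1, s2, win=2).mean()  # 4 samples per window
mean_coh_large = boxcar_coherence(s1, s2, win=4).mean()  # 16 samples per window
```

In this setting the mean estimated coherence is clearly above zero for both windows and is larger for the smaller one, which is the behavior described in the text.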
The NLM algorithm is a suitable way to deal with this trade-off, since the filtering power (number of looks) can be increased independently of the desired spatial resolution. In order to better explain the applied processing, in the following, we present the definition of the used interferometric parameters.
Let us indicate the SAR Single Look Complex (SLC) image as u, defined as
$$u = A \, e^{j\varphi}, \tag{2}$$
where A is the amplitude and φ the phase. The complex interferogram Γ can then be expressed as
$$\Gamma = u_1 \odot u_2^{*} = A_1 A_2 \, e^{j\theta}, \tag{3}$$
where the indexes 1 and 2 indicate master and slave SLCs, θ = φ_1 − φ_2 is the interferometric phase, and * and ⊙ indicate the conjugate operator and the Hadamard product, respectively. For a given interferogram pixel p, its nonlocal means estimation is given by
$$\hat{\Gamma}(p) = \frac{\sum_{q \in \Omega} w(p,q) \, \Gamma(q)}{\sum_{q \in \Omega} w(p,q)}, \tag{4}$$
where w(p, q) is the weight assigned to the comparison pixel q, taken from a neighborhood Ω of the pixel p. For the sake of simplicity, we indicate the image 2D coordinate with only one index, e.g., p.
In the present work, we exploit a well-known algorithm for the estimation of interferometric parameters: the NLSAR algorithm [6]. Here, the filtering is performed as in Equation (4) using the weight kernel of Equation (5), which is adapted depending on the local phase structure and in which c is a multiplicative constant and δ a parameter representing the trade-off between smoothing and detail preservation. The dissimilarity measure D_NLM, defined in Equation (6), depends on the generalized likelihood ratio between two interferometric patches (L_G), which for single-look data is expressed in Equation (7) in terms of C_p and C_q, the two covariance matrices related to the patches centered in p and q, respectively.
Finally, a qualitative comparison of the pixel selection approaches is shown in Figure 3 for the boxcar and nonlocal methods. For each pixel of the image, the boxcar window (c) has a fixed shape, size, and weights. The nonlocal means (NLM) approach searches within a larger area to achieve a more effective filtering result. An exemplary estimation window according to the NLM is depicted in (d), where the red, orange, and yellow pixels qualitatively describe large, medium, and low weights, respectively, depending on the specific similarity measure.
The nonlocal means approach allows for the achievement of a better noise suppression, a finer resolution preservation and, at the same time, a reduction of the coherence estimation bias, with respect to a boxcar filter with an equivalent filtering power. Hence, we apply NLSAR on TanDEM-X interferometric SAR data to improve the resulting classification capability on vegetated (forest/non-forest) areas. In particular, we use NLSAR with standard parameters, with the search window size and the patch size that can vary up to 25 × 25 and 11 × 11 pixels, respectively.
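The nonlocal weighted average can be sketched in a few lines. Note that this is a deliberately simplified stand-in and not the NLSAR algorithm: the weights use a Euclidean patch distance with an exponential kernel (the additive white Gaussian noise case mentioned above) instead of the likelihood-ratio-based kernel, and the function name `nonlocal_mean` as well as the parameters `search`, `patch`, and `h` are hypothetical.

```python
import numpy as np

def nonlocal_mean(ifg, search=5, patch=3, h=1.0):
    """Nonlocal weighted average of a complex interferogram (simplified sketch).

    Assumption: weights are exp(-d / h), with d the mean squared difference
    between patches. This is NOT the NLSAR weight kernel.
    """
    rs, rp = search // 2, patch // 2
    pad = rs + rp
    padded = np.pad(ifg, pad, mode="reflect")
    out = np.zeros_like(ifg, dtype=complex)
    rows, cols = ifg.shape
    for y in range(rows):
        for x in range(cols):
            cy, cx = y + pad, x + pad
            ref = padded[cy - rp:cy + rp + 1, cx - rp:cx + rp + 1]
            num, den = 0.0 + 0.0j, 0.0
            # Compare the reference patch with every patch in the search window.
            for dy in range(-rs, rs + 1):
                for dx in range(-rs, rs + 1):
                    qy, qx = cy + dy, cx + dx
                    cand = padded[qy - rp:qy + rp + 1, qx - rp:qx + rp + 1]
                    dist = np.mean(np.abs(ref - cand) ** 2)
                    w = np.exp(-dist / h)
                    num += w * padded[qy, qx]
                    den += w
            out[y, x] = num / den  # normalized weighted average
    return out

rng = np.random.default_rng(1)
clean = np.full((12, 12), 1.0 + 0.0j)
noisy = clean + 0.3 * (rng.standard_normal(clean.shape)
                       + 1j * rng.standard_normal(clean.shape))
filtered = nonlocal_mean(noisy)
noise_before = np.std(noisy - clean)
noise_after = np.std(filtered - clean)
```

On this homogeneous toy scene all patches are similar, so the filter behaves like a large averaging window and the residual noise drops; on real data the weights would instead concentrate on pixels with a similar local structure.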

The Volume Correlation Factor
As already mentioned in Section 3.1, the interferometric coherence γ represents the normalized complex correlation coefficient between master and slave acquisitions and describes the amount of noise affecting the interferogram [20]. Several contributions may cause a coherence degradation in TanDEM-X interferometric data [14], which, assuming statistical independence, can be factorized as:
$$\gamma = \gamma_{SNR} \, \gamma_{Quant} \, \gamma_{Amb} \, \gamma_{Range} \, \gamma_{Azimuth} \, \gamma_{Temp} \, \gamma_{Vol}, \tag{8}$$
where the terms on the right-hand side describe the error contributions due to: limited signal-to-noise ratio (γ_SNR), raw data quantization (γ_Quant), range and azimuth ambiguities (γ_Amb), spatial decorrelation (γ_Range), errors due to relative shift of Doppler spectra (γ_Azimuth), and temporal decorrelation (γ_Temp). The last term (γ_Vol) is called volume correlation factor. It describes the loss in coherence caused by the presence of volume scattering, e.g., over vegetated areas, and can therefore be exploited for forest mapping purposes.
Theoretically, a vegetation canopy can be modeled as the superposition of multiple scatterers located at different heights, each of them contributing with a different phase term [21]. The resulting volume decorrelation is then obtained from the ensemble average over all scatterers within the canopy layer. For the sake of simplicity, we do not recall here the details of the described model (the reader can however refer to several available publications [12,21]), which has been successfully verified by means of time series of TanDEM-X bistatic acquisitions over forested areas [22].
Given a coherence estimate γ, the volume correlation factor can be easily quantified by compensating for all other error sources as:
$$\gamma_{Vol} = \frac{\gamma}{\gamma_{SNR} \, \gamma_{Quant} \, \gamma_{Amb} \, \gamma_{Range} \, \gamma_{Azimuth} \, \gamma_{Temp}}. \tag{9}$$
The impact and the evaluation procedure of each decorrelation contribution from TanDEM-X data is discussed in detail in [4], where it is shown that the interferometric coherence, and more specifically the volume correlation factor, are more suitable for forest discrimination purposes than the SAR backscatter, for which a higher confusion between different land cover types was observed. For this reason, only the volume correlation factor is exploited for the discrimination between vegetated and non-vegetated areas. A global mosaic of γ_Vol from TanDEM-X data is shown in [22], where the coherence degradation in the presence of vegetation is clearly visible. In general, one should be aware that, due to the short wavelength, X band has a limited capability to penetrate through the vegetation canopy (if compared to longer wavelengths, such as L or P band), and the ability to discriminate between woody and herbaceous forest may be consequently affected. In this context, it has been demonstrated that the TanDEM-X system is actually sensitive to horizontal inhomogeneities in the vegetation canopy and a more elaborate model structure consisting of clouds of scatterers with gaps and extinction was suggested as more suitable to characterize the spectral properties of interferograms over forests [23]. Moreover, we are currently investigating possible tomographic approaches for a better understanding of forest structure at X band [24].
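The compensation of the other decorrelation sources amounts to a simple division, sketched below. The function name `volume_correlation`, the clipping of the result to [0, 1], and all numerical values are assumptions chosen for illustration; the actual estimation of each factor follows [4] and is not reproduced here.

```python
import numpy as np

def volume_correlation(coherence, *other_factors):
    """Divide the estimated coherence by the product of the other correlation factors."""
    coherence = np.asarray(coherence, dtype=float)
    product = np.ones_like(coherence)
    for factor in other_factors:
        product = product * np.asarray(factor, dtype=float)
    # Assumption: clip to [0, 1], since estimation noise can push the ratio above one.
    return np.clip(coherence / product, 0.0, 1.0)

gamma = np.array([0.60, 0.90])      # estimated coherence (illustrative values)
gamma_snr = np.array([0.95, 0.96])  # per-pixel SNR factor (illustrative values)
gamma_quant = 0.98                  # scalar quantization factor (illustrative value)
gamma_vol = volume_correlation(gamma, gamma_snr, gamma_quant)
```

Since every compensated factor is at most one, the resulting volume correlation factor is never smaller than the input coherence.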

The Forest/Non-Forest Classification Algorithm
In this section, we present an overview of the forest/non-forest classification method, which is applied to each TanDEM-X interferometric acquisition. It is based on a fuzzy clustering algorithm, an approach which is widely used in numerous contexts and applications for data classification, such as data mining or pattern recognition. For a more detailed description of the several aspects considered in the implementation, we refer to [4], where the present method is applied for the generation of the TanDEM-X Global Forest/Non-Forest Map.
For the present scenario, the volume correlation factor is applied as the only input feature for the discrimination of forested (F) and non-forested (NF) areas into c = 2 different classes. Fuzzy logic allows for a certain amount of overlap among different classes (or clusters), so that each input observation is theoretically associated with each existing partition with a certain degree or probability. For this purpose, we introduced a modified version of the membership function, originally proposed for the c-means fuzzy clustering algorithm [25], where we defined ad hoc weights depending on the statistical distribution of the input data. Hence, for each volume correlation value γ_Vol,k, estimated for the k-th image pixel (according to Section 3.2), we defined a so-called weighted membership function Û = [u_ik], with u_ik ∈ [0, 1], which describes the probability of that observation to belong to each of the c clusters (with i = 1, ..., c). Each cluster characterizes, therefore, observations with a high intracluster similarity and a low intercluster one.
Besides the physical characteristics of the forest under illumination, such as, e.g., its tree height and density, the volume correlation factor γ_Vol strongly depends on the actual imaging geometry employed for the interferometric survey. Indeed, in [22], it has been shown that the X-band coherence over forested areas is considerably influenced by the local incidence angle (in particular, stronger decorrelation is expected for steeper incidence angles, due to the increased penetration capability of the radar microwaves through the canopy) and by the interferometric baseline B_⊥ (or, equivalently, the height of ambiguity h_amb according to Equation (1)). The latter is due to the increase of the interferometric phase uncertainty of the ensemble average of the backscattered contributions within the same resolution cell. For the present forest classification method, a height of ambiguity not larger than 60-70 m should be employed, since, for smaller baselines, negligible volume decorrelation effects are typically observed, i.e., it becomes impossible to reliably distinguish between forest and non-forest classes [4].
Given a TanDEM-X bistatic acquisition, the algorithm settings (i.e., the cluster centers) are adapted to the specific acquisition geometry, namely the height of ambiguity h_amb and the local incidence angle θ_loc, which is the reason why we refer to a multi-clustering method for forest/non-forest classification from TanDEM-X interferometric data.
An important task for any classification or regression method consists in the initial training of the algorithm by means of an external reference. This is carried out in order to derive the location of each cluster center v = {v_F, v_NF}. As discussed in Section 2.1, we used the high-resolution Lidar-Optic forest/non-forest map, available for the complete State of Pennsylvania, both for training and for validation of the resulting forest/non-forest map. Obviously, different portions of the reference map were considered for training and validation, which are highlighted in Figure 1 in red and green, respectively. The heights of ambiguity of the considered acquisitions span between 38 m and 52 m (see Figure 2), while the nominal incidence angles range from 30° to 50°. It is important to note that the variation of the volume correlation factor γ_Vol can be considered negligible within such a limited interval of heights of ambiguity. Hence, following the approach presented in [4], we partitioned the original input γ_Vol data set into N_θloc subsets, considering its variation with respect to the incidence angle only. Such partitions were selected as a compromise, on the one hand, to guarantee a sufficiently large and well-balanced number of input γ_Vol observations for each θ_loc interval, and, on the other hand, to adequately sample the classification space comprised between the minimum and maximum θ_loc. As a result, we ended up with the classification set-up of Equation (10), in which h_amb,t and |θ_loc,t| indicate the thresholds (subscript t) that define the different intervals (that is, each partition is identified by a pair of consecutive values of h_amb,t and |θ_loc,t|, respectively).
For each {l, m} subset of input observations derived as in Equation (10), the corresponding cluster center v_l,m is finally obtained as the sample average of the corresponding γ_Vol distributions for forested and non-forested areas [4]
$$v_{l,m} = E\left[\gamma_{Vol}^{\{l,m\}}\right], \tag{11}$$
evaluated separately for the forest and non-forest classes, E[·] being the expectation operator. As introduced in Section 2.1, the Lidar-Optic reference map was averaged to the TanDEM-X posting, resulting in a forest density map. For the generation of the cluster centers, we have investigated different thresholds on the Lidar-Optic reference for the discrimination between forest and non-forest (in particular, we have compared density values of 10%, 30%, 50%, 70%, and 90%). We could verify that the values of the cluster centers are almost independent of the specific threshold applied on the Lidar-Optic reference density map. This is because the vegetated areas are typically highly clustered within the estimation window, so that the forest density actually assumes either very low (approaching 0%) or very high (towards 100%) values. For this reason, we selected a reasonable threshold on the Lidar-Optic reference of 50%.
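The training step described above (thresholding the reference density at 50% and averaging γ_Vol per class and per incidence angle interval) can be sketched as follows. The function name `cluster_centers`, the interval edges, and the sample values are hypothetical; in particular, the edges do not correspond to the thresholds of Equation (10).

```python
import numpy as np

def _mean_or_nan(x):
    # An empty class within an interval yields NaN instead of a runtime warning.
    return float(x.mean()) if x.size else float("nan")

def cluster_centers(gamma_vol, density, inc_angle_deg, angle_edges,
                    density_threshold=0.5):
    """Return one (v_F, v_NF) pair per incidence-angle interval."""
    is_forest = density >= density_threshold
    centers = []
    for lo, hi in zip(angle_edges[:-1], angle_edges[1:]):
        in_bin = (inc_angle_deg >= lo) & (inc_angle_deg < hi)
        v_f = _mean_or_nan(gamma_vol[in_bin & is_forest])
        v_nf = _mean_or_nan(gamma_vol[in_bin & ~is_forest])
        centers.append((v_f, v_nf))
    return centers

# Illustrative training samples (not measured values).
gamma_vol = np.array([0.50, 0.60, 0.95, 0.97, 0.55, 0.96])
density = np.array([0.90, 1.00, 0.00, 0.10, 0.80, 0.05])
inc_angle = np.array([32.0, 33.0, 34.0, 31.0, 45.0, 46.0])
centers = cluster_centers(gamma_vol, density, inc_angle, angle_edges=[30, 40, 50])
```

Changing `density_threshold` on such strongly bimodal density values leaves the class assignment, and hence the centers, essentially unchanged, which mirrors the observation reported in the text.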
Given a TanDEM-X bistatic scene, we let the fuzzy clustering algorithm run independently for each incidence angle interval and we derived the weighted membership function, in which the a priori information available from the training data is exploited to weight the Euclidean distances between the k-th observation and the i-th cluster.
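For orientation, the standard (unweighted) fuzzy c-means membership of [25] can be written compactly as below. This sketch does not include the ad hoc weights of the modified membership function used in this work; the function name `fcm_membership`, the fuzzifier `m = 2`, and the cluster center values are assumptions for illustration.

```python
import numpy as np

def fcm_membership(values, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means membership of 1-D observations to c cluster centers."""
    values = np.asarray(values, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # d[i, k]: distance between center i and observation k (eps avoids division by zero).
    d = np.abs(centers[:, None] - values[None, :]) + eps
    # u[i, k] = 1 / sum_j (d[i, k] / d[j, k]) ** (2 / (m - 1))
    ratio = (d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=1)

v_forest, v_non_forest = 0.55, 0.96  # illustrative cluster centers
u = fcm_membership([0.55, 0.755, 0.96], [v_forest, v_non_forest])
forest_membership = u[0]             # first row: membership to the forest cluster
```

An observation located exactly on a cluster center receives a membership close to one for that cluster, while one lying halfway between the two centers receives 0.5 for each.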
The distribution of the volume correlation factor, estimated from the training data (red square in Figure 1) using boxcar and nonlocal filtering methods (both at 6 m resolution), is depicted in Figure 4a. The values of the corresponding cluster centers were calculated according to Equation (11) and are listed in Table 1 for the different filtering methods (boxcar at 6 m and 12 m, nonlocal at 6 m) and incidence angle ranges, as in Equation (10). In general, lower mean values of γ_Vol (and slightly smaller standard deviations) were obtained for the nonlocal filter, as a consequence of the larger number of looks used for the coherence estimation, as discussed in Section 3.1. Due to the smaller size of the boxcar window, the coherence is consequently biased, resulting in larger values of γ_Vol, together with larger standard deviations. Moreover, we estimated the cluster centers over the same region by using quicklook images as input as well; these are reported in the last row of Table 1 (BXC-50 m). TanDEM-X quicklook images represent a spatially averaged version of the original full-resolution data at a ground independent pixel spacing of 50 m × 50 m and their global data set was used for the generation of the global TanDEM-X Forest/Non-Forest Map. In this case, the obtained cluster center values are about 10% lower than those used for the classification of temperate forest for the global forest map, which were obtained by training on a data set over a large region in central Europe (see, in particular, Figures 5 and 6b of [4], for heights of ambiguity between 40 m and 50 m). Therefore, this aspect also suggests the opportunity to refine the grouping of the different forest types to improve the classification accuracy at global scale and will be the object of further investigations.
If we now consider bare soil areas instead of forested ones, the volume correlation factor is independent of the particular combination of incidence angles/heights of ambiguity [22], and mean values typically in the order of 0.95 or larger were obtained for the non-forest class v_NF, which takes into account possible uncompensated decorrelation contributions, due to, e.g., ambiguities or geometry-induced spectral shifts.
Once the weighted membership was derived for each bistatic scene, the multiple information available from overlapping acquisitions needed to be properly combined, in order to generate the final mosaic. For this purpose, we applied the approach presented in [4], which is briefly recalled in the following for the sake of clarity.
For each image pixel, the N weighted memberships available from N input overlapping acquisitions are merged in a weighted averaging process for the derivation of a combined membership Ũ_comb
$$\tilde{U}_{comb} = \frac{\sum_{i=1}^{N} \alpha_i \, \hat{U}_i}{\sum_{i=1}^{N} \alpha_i}, \tag{12}$$
where α_i are the mosaicking weights, defined in Equation (13) as a function of ∆γ_Vol and γ_SNR: ∆γ_Vol = ||v_NF − v_F|| is the difference between the cluster centers of the corresponding forest (v_F) and non-forest classes (v_NF) and is a function of the local incidence angle and height of ambiguity, while γ_SNR is the SNR correlation factor of the i-th observation. To obtain the final binary forest/non-forest information, we set a threshold at 50% on the weighted membership (at such high resolution, no significant dependency of the final classification performance on the selected threshold was observed, contrary to [4]). Once this step was completed, we applied additional information layers by exploiting external classification maps in a final post-processing step, in order to improve the resulting classification accuracy [4]. For the present analysis, water bodies as well as urban areas were filtered out as follows:

• Water Bodies typically show low backscatter values, in the same order of the system noise equivalent sigma zero (NESZ). These are due to the almost specular reflection of the radar signal over calm water areas, which may lead to a bias in the volume correlation factor and, in turn, to classification errors. To mitigate this effect, we used the freely available global map of open permanent water bodies provided by the European Space Agency (ESA) from the Climate Change Initiative (CCI) [26] at a spatial resolution of 150 m × 150 m. Moreover, we are currently investigating the opportunities for high-resolution water mapping by means of TanDEM-X interferometric data, as presented in [27]. A preliminary example showing the potential of the proposed method is shown in Figure 5, which depicts in (a) an optical image of Lake Wallenpaupack (latitude/longitude 41.4° N, 75.2° W). The corresponding 150 m resolution water mask provided by ESA overlaid on the Lidar-Optic forest (green)/non-forest (white) map used for reference is shown in (b), whereas the experimental, high-resolution water mask derived by exploiting nonlocal-filtered coherence and amplitude generated from TanDEM-X data (also overlaid on the Lidar-Optic forest/non-forest map) is presented in (c). The improvement in terms of accuracy and achievable resolution is evident, and the generation of such a product at regional up to global scale will be the object of future activities.

• Urban Areas degrade the InSAR performance due to the occurrence of geometrical distortions and multiple reflections induced by man-made structures (buildings, bridges, ...). Similarly to vegetated areas, these effects lead to a loss in the volume correlation factor. To prevent resulting misclassification, urban areas are filtered out by applying the binary Global Urban Footprint (GUF) derived from full-resolution TanDEM-X data backscatter information at a resolution of 12 m [28].

• Invalid pixels affected by geometric distortions, such as shadow or layover, are identified by exploiting backscatter as well as coherence information and are filtered out of the final classification map.
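The mosaicking, thresholding, and masking steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing chain used for the maps: the function name, the class codes, the array layout, and the choice of ">=" at the 50% threshold are hypothetical, and the weights α_i are taken as a precomputed input because their exact definition is not reproduced here.

```python
import numpy as np

# Hypothetical class codes, chosen only for this sketch.
NON_FOREST, FOREST, WATER, URBAN, INVALID = 0, 1, 2, 3, 4

def mosaic_forest_map(memberships, weights, water_mask, urban_mask,
                      invalid_mask, threshold=0.5):
    """Merge N overlapping forest memberships and label each pixel.

    memberships: (N, H, W) forest membership in [0, 1], NaN where no data.
    weights:     (N, H, W) non-negative mosaicking weights (alpha_i),
                 assumed to be computed beforehand.
    *_mask:      (H, W) boolean layers from external or auxiliary data.
    """
    memberships = np.asarray(memberships, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Observations without data contribute zero weight.
    valid = np.isfinite(memberships) & np.isfinite(weights)
    w = np.where(valid, weights, 0.0)
    u = np.where(valid, memberships, 0.0)

    # Weighted average over the N acquisitions covering each pixel.
    wsum = w.sum(axis=0)
    covered = wsum > 0
    combined = np.full(wsum.shape, np.nan)
    np.divide((w * u).sum(axis=0), wsum, out=combined, where=covered)

    # 50% threshold on the combined membership (">=" is an assumption).
    is_forest = np.zeros(wsum.shape, dtype=bool)
    is_forest[covered] = combined[covered] >= threshold

    labels = np.full(wsum.shape, INVALID, dtype=np.uint8)
    labels[covered & is_forest] = FOREST
    labels[covered & ~is_forest] = NON_FOREST

    # Post-processing: external layers override the classification.
    labels[np.asarray(invalid_mask, dtype=bool)] = INVALID
    labels[np.asarray(urban_mask, dtype=bool)] = URBAN
    labels[np.asarray(water_mask, dtype=bool)] = WATER
    return combined, labels
```

In this sketch, pixels not covered by any acquisition are labeled as invalid, and the water mask takes precedence over the urban and invalid layers; both choices are illustrative.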

High-Resolution Forest/Non-Forest Maps of Pennsylvania
The high-resolution mosaicked Forest/Non-Forest (F/NF) map of the State of Pennsylvania is presented in Figure 6. Although the details and spatial features cannot be appreciated from this large-scale figure, the map has been generated by exploiting nonlocal filtering at a spatial resolution of 6 m. Forested and non-forested areas are depicted in green and white, respectively. Water bodies are indicated in blue, whereas urban settlements and invalid areas are represented in black. The borders of the State of Pennsylvania are outlined by the bold white line. For computational reasons, the map was divided into geocells extending by 0.5° × 0.5° in latitude/longitude.
To illustrate the differences among the filtering types and their effect on the resulting classification, we present in Figure 7 the forest/non-forest map from a single TanDEM-X bistatic scene for a cutout area of about 3 km × 2 km for different filtering methods and resolutions:

Validation and Performance
To validate the derived forest/non-forest mosaics, and as a quality measure for performance assessment, we use the accuracy A, defined as

$$A = \frac{TP + TN}{TP + TN + FP + FN}, \qquad (14)$$

which represents the fraction of pixels that have been correctly detected with respect to the total number of pixels. The contributions in Equation (14) represent the true positives (TP, i.e., pixels classified as "forest" in both the TanDEM-X and the reference map), true negatives (TN), false positives (FP), and false negatives (FN), which are contained in the so-called confusion matrix. The spatial distribution of the values in the confusion matrix can be seen in the confusion map. Figure 8 shows the confusion map for three geocells, each one extending by 0.5° × 0.5° in latitude/longitude. The lower-left corner coordinates of each geocell are indicated below each subfigure. The true positives (forest) and true negatives (non-forest) are indicated in green and white, respectively, while false positives and false negatives are shown in red and blue. Filtered-out pixels (water, urban areas, invalid) are depicted in black. The accuracy A, calculated for each validation geocell, is shown in Figure 9 for the classification maps obtained with boxcar at 6 m (blue), boxcar at 12 m (red), and nonlocal at 6 m (green). The nonlocal method always outperforms the boxcar at the same resolution, and only in a few cases does the boxcar at 12 m show slightly better performance. The solid horizontal lines indicate the overall accuracy on the complete validation area for each filtering method and posting, which is about 76% for the boxcar at 6 m, 77.4% for the boxcar at 12 m, and 80.5% for the nonlocal at 6 m. The dashed lines delimit the ±1σ interval around the overall accuracy, computed from the accuracies of the single geocells. These results verify the advantage of exploiting nonlocal filtering methods for land cover classification with interferometric SAR data. This performance gain comes at the cost of a longer processing time, about eight times that of a standard boxcar filter.
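As an illustration of the validation step described above, the following sketch builds a confusion map from two binary forest maps and derives the accuracy from it, leaving out the filtered pixels. The function names and integer codes are hypothetical and chosen only for this example; both maps are assumed to be already resampled to a common grid.

```python
import numpy as np

# Hypothetical codes for the confusion map, chosen only for this sketch.
TP, TN, FP, FN, EXCLUDED = 0, 1, 2, 3, 4

def confusion_map(pred_forest, ref_forest, excluded):
    """Label each pixel as TP, TN, FP, FN, or EXCLUDED.

    pred_forest: (H, W) boolean, True where the tested map says forest.
    ref_forest:  (H, W) boolean, True where the reference says forest.
    excluded:    (H, W) boolean, True for water, urban, or invalid pixels.
    """
    pred = np.asarray(pred_forest, dtype=bool)
    ref = np.asarray(ref_forest, dtype=bool)
    keep = ~np.asarray(excluded, dtype=bool)

    cmap = np.full(pred.shape, EXCLUDED, dtype=np.uint8)
    cmap[keep & pred & ref] = TP
    cmap[keep & ~pred & ~ref] = TN
    cmap[keep & pred & ~ref] = FP
    cmap[keep & ~pred & ref] = FN
    return cmap

def accuracy(cmap):
    """Fraction of correctly classified pixels among the non-excluded ones."""
    counts = np.bincount(cmap.ravel(), minlength=5)
    total = counts[TP] + counts[TN] + counts[FP] + counts[FN]
    if total == 0:
        return float("nan")  # geocell contains only excluded pixels
    return float(counts[TP] + counts[TN]) / float(total)
```

Applying `accuracy` to the confusion map of each geocell yields per-geocell values, while pooling the counts over all geocells would yield the accuracy of the complete validation area.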

Discussion
In general, we observed lower accuracy over areas dominated by rugged terrain, where geometrical distortions may cause an incorrect estimation of the volume correlation factor. This effect is visible, e.g., over the high-relief area of the Appalachian Mountains in the center of Figure 8a, which shows a larger number of false negatives (i.e., TanDEM-X: non-forest, Lidar-Optic reference: forest, depicted in blue). In this respect, the reader should be aware that the data investigated in this paper were acquired in ascending orbit only, and that acquisitions with, e.g., crossing imaging geometry would help improve the final performance. Moreover, the low-resolution water mask and some detection inaccuracies of the Global Urban Footprint (GUF), applied to filter out water bodies and urban settlements, respectively, may additionally degrade the performance. This effect is visible, e.g., in the proximity of the urban area in the bottom-right corner of Figure 8a, where a larger number of false positives (i.e., TanDEM-X: forest, Lidar-Optic reference: non-forest, depicted in red) appears. On the other hand, over flat terrain (as shown in Figure 8b,c), the classification does not show systematic errors, but the performance is still affected by the high spatial variability ("noisiness") of the volume correlation factor, caused by the limited number of looks used for its estimation. This effect is particularly relevant for the boxcar filtering at 6 m, as visible in Figure 7a. This is another reason why the obtained accuracy levels are lower than those obtained over the same area for the 50 m forest/non-forest mosaics discussed in [4], where an accuracy in the range between 85% and 93% was observed. As shown in Figure 9, the accuracy levels of the high-resolution maps are typically in the range between 65% and 90%. The performance decrease therefore represents the trade-off between the desired final accuracy and the improvement in terms of spatial resolution and detail preservation (in our particular case, with respect to the 50 m maps, we increased the resolution by a factor of about 17 for the 12 m mosaics and about 70 for the 6 m mosaics).
Moreover, one should note that for the global product presented in [4] at least two coverages acquired between 2011 and 2013 were mosaicked together and, over mountainous regions (such as the Appalachian Mountains), dedicated acquisitions with opposite viewing geometry (i.e., descending orbit) were additionally available. This allowed for a further mitigation of geometrical distortions and possible seasonal effects (i.e., changes in coherence due to snow/rain events), increasing the final classification accuracy. For the investigations presented in this paper, on the other hand, only data acquired for the first global coverage (during 2011) have been considered, in order to limit the computational burden and keep the processing time feasible (in total, about 200 bistatic scenes were considered for this analysis). The inclusion of a larger dataset and its benefit for the resulting high-resolution classification maps, at least on a local scale, will be the subject of future studies.
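The resolution factors quoted above (about 17 and 70) are consistent with a ratio of pixel areas rather than of linear pixel spacings. This reading is an assumption, since the computation is not spelled out in the text:

```latex
\left(\frac{50\,\mathrm{m}}{12\,\mathrm{m}}\right)^{2} \approx 17.4,
\qquad
\left(\frac{50\,\mathrm{m}}{6\,\mathrm{m}}\right)^{2} \approx 69.4
```

The corresponding linear factors would be about 4.2 and 8.3, respectively.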
Looking at the high-resolution classification maps, the geocells generated by exploiting nonlocal filtering (green circles in Figure 9) show an accuracy that is in all cases better than that of the geocells generated with the boxcar filtering at 6 m (blue circles), and in the majority of cases (for 18 out of 24 of the analyzed geocells) better than that obtained by filtering with boxcar at 12 m resolution. The similar performance between boxcar 6 m and boxcar 12 m is due to the fact that the number of looks in both cases is comparable.

Conclusions
In this paper, we investigated the potential and limitations of high-resolution forest classification by exploiting single-pass interferometric synthetic aperture radar (InSAR). We used bistatic TanDEM-X stripmap single polarization (HH) data acquired in 2011 to generate Forest/Non-Forest classification mosaics of the State of Pennsylvania, USA, at ground resolutions down to 6 m. For this purpose, we considered a multi-clustering classification approach, previously developed by the authors [4], which exploits the coherence-derived volume correlation factor γ_Vol as the main input observable to discriminate forested from non-forested areas. In this work, we focused on different filtering methods used for the estimation of the interferometric coherence, in order to evaluate their impact on the resulting classification performance. In particular, we compared standard boxcar and nonlocal NLSAR filtering. We then validated the resulting forest/non-forest maps using, as external reference, an accurate Lidar-Optic forest/non-forest map of the area. The obtained results show that nonlocal filtering is a necessary step for the generation of highly accurate classification maps at such fine resolution. Thanks to its remarkable capabilities in terms of noise reduction and spatial feature preservation, the NLSAR filtering approach provides an average improvement of the classification accuracy of about 4.5 percentage points with respect to the equivalent boxcar at 6 m and of about 3 percentage points with respect to the boxcar at 12 m resolution, with an overall accuracy of 80.5%.
Future investigations will include alternative classification approaches for high-resolution forest mapping. In this scenario, the benefits and limitations of taking into account additional input SAR descriptors, such as the backscatter, the interferometric phase, and/or the DEM height estimate, will be considered. This will open the door for the utilization of more powerful classification methods

Figure 1. Optical Google Earth image of the State of Pennsylvania, USA; the green polygons indicate the footprints of the TanDEM-X bistatic scenes used for the generation of the forest/non-forest classification maps. The red and blue polygons highlight the regions where the Lidar-Optic reference map was selected for data training and product validation, respectively.

Figure 3. Principle of nonlocal pixel selection. The pixel p, in blue, is located on a road in the SAR amplitude images ((a,b) at the top). For each pixel of the image, the boxcar window (c) has a fixed shape, size, and weights. The nonlocal means (NLM) approach searches within a larger area to achieve a more effective filtering result. An exemplary estimation window according to the NLM is depicted in (d), where the red, orange, and yellow pixels qualitatively describe large, medium, and low weights, respectively, depending on the specific similarity measure.
The distributions of the volume correlation factor obtained with boxcar and nonlocal filtering are shown in Figure 4a,b, respectively, for forest (red) and non-forest (blue) areas, and for θ_loc ∈ [35°, 45°]. The slightly smaller values of the volume correlation distributions obtained with nonlocal filtering, both for forest and non-forest classes, are due to the use of a larger number of looks, which reduces the coherence bias, as explained in Section 3.1.

Figure 4. Sampled probability distributions of the volume correlation factor, estimated for the training data set for boxcar (a) and nonlocal (b), both sampled at 6 m resolution, for forest (red) and non-forest (blue) areas, and θ_loc ∈ [35°, 45°].

Figure 5. (a) Optical Google Earth image of Lake Wallenpaupack (latitude/longitude 41.4° N, 75.2° W); (b) water mask (in blue) from the ESA-CCI [26] at a spatial resolution of 150 m, overlaid on the Lidar-Optic reference map (green: forest, white: non-forest); (c) water mask generated by exploiting nonlocal-filtered coherence and amplitude from TanDEM-X InSAR data at a posting of 6 m, overlaid on the Lidar-Optic reference map.
(a) boxcar filter at 6 m resolution; (b) boxcar at 12 m; and (c) nonlocal filter at 6 m posting. The corresponding Lidar-Optic reference map at 1 m resolution is depicted in subfigure (d). Visual inspection shows that the nonlocal estimation approach provides the best performance in terms of detail preservation and achievable accuracy of the resulting classification map. In the following, a detailed validation of the classification maps generated with the different filtering methods is presented.

Figure 6. Forest/non-forest map of the State of Pennsylvania (USA), generated using nonlocal filtering at a resolution of 6 m and overlaid on Google Earth. Forested and non-forested areas are depicted in green and white, respectively. Water bodies are indicated in blue, whereas urban settlements and invalid areas are represented in black. The borders of the State of Pennsylvania are outlined by the bold white line.

Figure 7. Forest/non-forest map for a small area in Pennsylvania (with center coordinates 41.1° N, 76.6° W), extending by about 3 km × 2 km, obtained from TanDEM-X data using boxcar filter at 6 m resolution (a), boxcar at 12 m (b), and nonlocal filter at 6 m posting (c); (d) shows the original Lidar-Optic reference map at 1 m resolution.

Figure 8. Confusion maps for three exemplary geocells, each one extending by 0.5° × 0.5° in latitude/longitude, with lower-left corner coordinates indicated below each subfigure. TP are indicated in green, TN in white, FP in red, and FN in blue. Pixels which have been filtered out (water, urban areas, invalid) are depicted in black.

Figure 9. Validation of the forest/non-forest map by means of the Lidar-Optic reference map. Each accuracy value is expressed as a percentage and is obtained for a geocell extending by 0.5° × 0.5°, with lower-left coordinates indicated on the horizontal axis. Each filtering method and posting is represented in one color (legend at the bottom-right corner). The solid horizontal lines indicate the overall accuracy on the complete validation area. The dashed horizontal lines represent the geocell standard deviation.

Table 1. Mean value (cluster centers, v_F) ± standard deviation of the distributions of the volume correlation factor for the class "forest", γ_Vol,F, for different combinations of incidence angle ranges and nonlocal (NL) and boxcar (BXC) filter at different resolutions.
Cluster Centers | θ_loc < 35° | 35° ≤ θ_loc < 45° | θ_loc ≥ 45°