Forest Assessment Using High Resolution SAR Data in X-Band

Novel radar satellite missions also include sensors operating in X-band at very high resolution. The presented study reports methodologies, algorithms and results on forest assessment utilizing such X-band satellite images, namely from TerraSAR-X and COSMO-SkyMed sensors. The proposed procedures cover advanced stereo-radargrammetric and interferometric data processing, as well as image segmentation and image classification. A core methodology is the multi-image matching concept for digital surface modeling based on geometrically constrained matching. Validation of generated surface models is made through comparison with LiDAR data, resulting in a standard deviation height error of less than 2 meters over forest. Image classification of forest regions is then based on X-band backscatter information, a canopy height model and interferometric coherence information yielding a classification accuracy above 90%. Such information is then directly used to extract forest border lines. High resolution X-band sensors deliver imagery that can be used for automatic forest assessment on a large scale.


Introduction
Forest stand height is an important indicator of forest biomass for management purposes as well as for the assessment of carbon stocks [1]. The potential of such height estimates has been recognized by initiatives such as the Kyoto Protocol and by carbon accounting [2,3]. General forest parameters are an important source of information for monitoring climate change issues, quantifying renewable resources, and observing deforestation and forest degradation. These parameters can best be estimated when 3D information on forest, i.e., a canopy height model, is integrated into the classification process [4]. However, forest is generally hard to map with optical sensors due to the influence of ground vegetation, shadow, cloud coverage and saturation (when the amount of biomass reaches a certain level), which may be especially high over rain forests [5][6][7].
Due to these issues the question arises if the forest stand height can be reconstructed using SAR images in X-band, since the signal penetrates clouds enabling continuous mapping also in tropical regions. The techniques that are used to retrieve 3D information from SAR data sets are SAR interferometry and stereo-radargrammetry. It is known that repeat pass SAR interferometry using X-band data cannot be applied practically over regions of forest due to the strong temporal phase decorrelation caused by the small wavelength of X-band [8]. Therefore, this study is based on stereo-radargrammetric processing of appropriate TerraSAR-X and COSMO-SkyMed data sets acquired at multiple viewing angles. Since the early years of SAR remote sensing stereo-radargrammetric techniques were applied to SAR image pairs, as demonstrated by the review paper given in [9]. The SAR stereo potential was distinctly augmented with the appearance of sensors capable of acquiring image data at different viewing angles, like the Canadian Radarsat-1 or the European Envisat-ASAR. In this context, investigations made in [10] and in [11] may be exemplarily referenced.
A new era was recently introduced with novel high resolution SAR satellite data in X-band, in particular from the TerraSAR-X [12] and COSMO-SkyMed [13] missions, both launched in June 2007. These sensors are able to collect images with a ground sampling distance (GSD) down to 0.75 m in Spotlight mode at various look angles. In addition, those sensors deliver imagery with very precise pointing accuracy [14][15][16], so that also remote regions where no reference data, i.e., ground control points, is available can be mapped and processed.
The presented approach for digital surface model extraction for forest assessment is based on the authors' previous works [17][18][19] where 3D surface reconstruction was performed by stereo-radargrammetry. A core achievement is the multi-image matching concept for digital surface modeling that has been transferred from the authors' previous work based on optical satellite images [20] to radar data, incorporating the SAR specific image geometry. Additionally, geometric constraints are embedded in the image matching process, improving the performance both in processing speed and accuracy, extending the standard stereo-radargrammetric approach.
One crucial problem when using X-band signals is the downward penetration into the forest canopy, which shifts the InSAR phase center and leads to a systematic underestimation of the forest height (see [21][22][23][24] and as indicated in [25]; it should be noted that the cited works deal with neither TerraSAR-X nor COSMO-SkyMed data). The presented solution is based on a relative calibration to reference data, and the validation of such surface models is made through comparison with LiDAR data.
In order to retrieve additional forest parameters, an image classification is presented based on X-band backscatter (intensity and texture), the estimated 3D canopy height model (CHM) and interferometric coherence information. Such information is then directly used to extract forest border lines. Overall, it will be shown that high resolution X-band imagery can be used for automatic forest assessment on a large scale.

Our Methods
To estimate forest height, the first step is to reconstruct a digital surface model (DSM) for the region of interest. In the second step, the forest height, or the so-called canopy height model (CHM), is extracted by subtracting an existing digital terrain model (DTM) from the surface model. It should be noted that this aspect is a weakness of the proposed method, since for many regions no accurate DTM exists. However, LiDAR data is currently available for large areas so that the presented method is a reliable and low priced method for updating existing LiDAR based CHMs.
Since the X-band radar signal penetrates into the canopy, causing a bias in the height estimate, a forest segmentation is performed, which is then used to correct the DSM according to a calibration to reference information. In this work a forest segmentation is defined as a binary classification into forest and non-forest regions, i.e., a forest mask. An overview of the proposed workflow is sketched in Figure 1. As seen, a DSM is extracted using multi-image radargrammetry. This DSM is utilized together with InSAR products and backscatter information to derive a forest mask. Finally, this mask helps to correct the height of canopy regions, resulting in the final corrected DSM. Thus, a forest mask is generated as a byproduct, which is in fact a very important forest parameter by itself.
The proposed algorithms are implemented in two Joanneum Research in-house software environments, namely the "Remote sensing Software Graz (RSG)" and the "Image Processing and Classification Tool (IMPACT)".
Figure 1. Proposed workflow for deriving forest parameters using X-band SAR data.

Multi-Image DSM Generation
The accurate 3D reconstruction of forest regions using very high resolution SAR imagery alone is very challenging due to two reasons.
• First, the traditional repeat pass interferometric processing does not yield appropriate results over forest, as the InSAR phase decorrelates within the 11-day TerraSAR-X repeat cycle [8], but also for shorter temporal baselines, like one day for COSMO-SkyMed-2 to COSMO-SkyMed-3 sensors. The same observations hold for ERS-1 and ERS-2 [26] and ERS-1/2 tandem [27].
• Second, even in cases of temporal phase correlation the resulting canopy height is systematically underestimated. The reason is that the SAR signal in X-band penetrates into the forest canopy, changing the InSAR phase center and therefore the reconstructed height. This aspect has been observed on InSAR-based processing of airborne X-band data [21][22][23][28], on X-band SRTM satellite data [7], on Radarsat-2 data [25], and on simulations [22,24].
To tackle all these difficulties we first derive digital surface models using a multi-image stereo-radargrammetric approach. Then, the canopy height underestimation is corrected from the resulting DSM by applying an empirically learned correction model on regions of forest.
The radargrammetric processing is described in detail in [18] and can be applied successfully due to the very exact pointing accuracy of the TerraSAR-X and COSMO-SkyMed sensors [14][15][16]. The main steps in the DSM extraction are pairwise stereo matching followed by a joint point intersection procedure. The matching itself is based on a hierarchical approach employing multiple normalized cross-correlation kernels of different sizes as similarity measure. To get robust matching results image triplets are used, i.e., three SAR images acquired at different look angles. This method takes advantage of the good matching properties of adjacent images (i.e., small look angle difference below 15°) and the good geometric properties of non-adjacent images (i.e., large look angle differences) at the same time. By additionally including the matching results of non-adjacent images the spatial point intersection procedure increases in robustness due to an over-determination.
Stereo matching of a SAR image pair is improved by including geometric constraints. First, one image is quasi-epipolar registered to the other based on an affine polynomial transformation employing both sensor models to automatically generate tie-points. Second, for image matching a starting location for each pixel is predicted, again using sensor models and a coarse DSM (SRTM or ASTER model), by forward and backward intersection.
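The least-squares estimation of an affine transformation from tie-points, as used for the quasi-epipolar registration, can be illustrated with a minimal NumPy sketch. This is not the RSG implementation: the function names, the grid spacing and the synthetic tie-points are assumptions made for the example, and in practice the tie-points would come from the sensor models and a coarse DSM.

```python
import numpy as np

def fit_affine(ref_pts, search_pts):
    """Least-squares affine transform mapping ref_pts onto search_pts.

    Both inputs are (N, 2) arrays of (x, y) tie-point coordinates, N >= 3.
    Returns a 2x3 matrix A such that search ~= A @ [x, y, 1].
    """
    ref_pts = np.asarray(ref_pts, dtype=float)
    search_pts = np.asarray(search_pts, dtype=float)
    # Design matrix with a constant column for the translation terms.
    design = np.hstack([ref_pts, np.ones((len(ref_pts), 1))])
    # Solve design @ X = search_pts for X (3x2) in the least-squares sense.
    params, *_ = np.linalg.lstsq(design, search_pts, rcond=None)
    return params.T

def apply_affine(A, pts):
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    return pts @ A[:, :2].T + A[:, 2]

# Hypothetical tie-points: a regular grid in the reference image ...
gx, gy = np.meshgrid(np.arange(0, 1000, 250), np.arange(0, 1000, 250))
ref = np.column_stack([gx.ravel(), gy.ravel()])
# ... and synthetic positions in the search image (assumed values).
true_A = np.array([[1.02, 0.01, 12.0],
                   [-0.005, 0.99, -3.0]])
search = apply_affine(true_A, ref)

A_est = fit_affine(ref, search)
```

With noise-free tie-points the estimated matrix reproduces the synthetic one; with real tie-points the residuals of the fit indicate how well an affine model describes the relative image geometry.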
The individual steps of the processing chain are described in detail below. First, two images forming a stereo pair are processed to get the disparities between those images. Second, several of such disparity maps are jointly processed to extract one DSM.
First, for each stereo pair dense correspondences are extracted by:
• Preprocessing of the SAR images by means of despeckle filtering. The tests revealed that a classical Lee filter with a 5 × 5 pixel neighborhood increases the accuracy of the follow-up image matching procedure. Other filters like Kuan, Frost or GammaMAP did not show additional benefits w.r.t. the final DSM quality.
• Automatic tie-pointing between the two images is based on the SAR sensor models and a coarse DSM (SRTM or ASTER model) or an average elevation estimate of the area of interest. A regular grid of points in the reference image is projected onto the DSM or an ellipsoid and then back-projected into the search image. Using these 2D tie-points an affine transformation is determined using a least-squares approach.
• Coarse registration of the search image to the reference image employing the affine transformation. A 6-point cubic resampling method is used for interpolation. The main purpose of this quasi-epipolar registration is that the main disparity direction is aligned with one image dimension (for TerraSAR-X and COSMO-SkyMed horizontal due to the across-track stereo setup, i.e., disparities in range direction), so that the search window can be set as an elongated rectangle in epipolar direction. As a side effect the registered images are not rotated towards each other, so that the matching is limited to finding a 2D translation of each point.
• Extracting disparity predictions between the reference and the registered search image. Like in the tie-pointing phase a regular grid (e.g., every 16th pixel) of the reference image is projected into the registered search image using the SAR sensor models and a coarse DSM resulting in a 2D shift vector per processed pixel. In the following matching step these vectors define the start location for local image matching in the search image. This step is very useful in case of steep terrain or in general in hilly areas where the disparities from one image to the other span over a wide range. Having a coarse starting location the search within the matching step can be constrained, yielding speed-up and simultaneously increasing the accuracy of matching, since the likelihood for perceiving similar objects within a smaller matching window decreases. In cases of relatively flat terrain or when no coarse DSM exists, this step can be skipped.
• Image matching in order to find point correspondences. The proposed approach is based on image pyramids, where the results, i.e., the disparities, are calculated on the smaller image pyramid level and are then projected to the next larger pyramid level for refinement (cf. [29]). The previously mentioned disparity predictions are employed on the highest pyramid level to define the starting locations for matching. When no such predictions exist a null predictor is used, meaning that the starting location for matching in the search image is the same pixel location as the current reference image pixel. The matching itself is based on an areal search, comparing the local patch of the reference image within a search window in the search image. It turned out that the normalized cross-correlation outperforms other similarity measures like sum of absolute differences, Census transform or mutual-information in case of X-band SAR data.
To get robust results, multiple normalized cross-correlation measures of different spatial extent are combined. The basic idea is that large "kernels" yield robust coarse results, whereas small "kernels" yield better location accuracy but also produce outliers, i.e., mismatches. Therefore, the combination of cross-correlation kernels with the sizes 15 × 15, 7 × 7 and 3 × 3 pixels outperforms the individual results and is a trade-off between robust results (larger kernel) and precise location accuracy (small kernel). The similarity function is thus defined as the sum of the three normalized cross-correlation values centered on the same search image position. For all reported results three pyramid levels were used, the search region was constrained to 11 × 5 pixels, only every 4th pixel was matched for speed-up, and the prediction was based on the SRTM model.
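The combined similarity measure can be sketched for a single pixel on one pyramid level as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the 11 × 5 search region is interpreted as ±5 pixels in range (columns) and ±2 pixels in azimuth (rows), and image borders as well as the pyramid hierarchy are not handled.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def patch(img, row, col, size):
    """Square patch of odd side length `size` centered on (row, col)."""
    h = size // 2
    return img[row - h:row + h + 1, col - h:col + h + 1]

def multi_kernel_score(ref, search, r, c, rs, cs, sizes=(15, 7, 3)):
    """Sum of the NCC values of all kernel sizes at one search position."""
    return sum(ncc(patch(ref, r, c, k), patch(search, rs, cs, k))
               for k in sizes)

def match_pixel(ref, search, r, c, pred=(0, 0), half_range=(2, 5)):
    """Best (row, col) disparity around the predicted start location.

    `pred` is the disparity prediction; (0, 0) is the null predictor.
    `half_range` = (2, 5) gives a 5-row by 11-column search region
    (assumed orientation of the 11 x 5 window).
    """
    best_score, best_disp = -np.inf, (0, 0)
    for dr in range(-half_range[0], half_range[0] + 1):
        for dc in range(-half_range[1], half_range[1] + 1):
            rs, cs = r + pred[0] + dr, c + pred[1] + dc
            score = multi_kernel_score(ref, search, r, c, rs, cs)
            if score > best_score:
                best_score, best_disp = score, (pred[0] + dr, pred[1] + dc)
    return best_disp, best_score

# Synthetic test: the search image is the reference shifted by (1, 3).
rng = np.random.default_rng(0)
ref_img = rng.random((64, 64))
search_img = np.roll(ref_img, shift=(1, 3), axis=(0, 1))
disp, score = match_pixel(ref_img, search_img, 32, 32)
```

In this synthetic case the maximum of the summed score lies at the true shift, where each of the three kernels contributes a correlation of one.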
Second, multiple disparity maps are used to extract one DSM. In the proposed triplet approach three stereo constellations are incorporated (image 1 to image 2, image 2 to image 3 and image 1 to image 3):
• Spatial point intersection, i.e., an iterative least squares approach to find the 3D intersection point of SAR range circles as defined by the corresponding image pixels delivered from image matching. Within the spatial point intersection, the matching results achieved from individual image pairs are jointly used. In the ideal case, a point can be matched in all three image pairs, yielding four range circles in space to be intersected (the fourth measure is collected by tracing points from image 1 to image 2 and then to image 3 using the adjacent stereo matching results). Due to the extended over-determination of the least squares point intersection, erroneous matching results are either detected and removed or their displacement impact is reduced. This methodology has also been applied successfully to optical imagery [20,30].
• DSM resampling or rather regridding, i.e., interpolation of a regular raster of height values from these 3D points. Remaining gaps are filled using linear interpolation of the neighboring height values.
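The principle of the iterative least-squares point intersection can be illustrated with a strongly simplified sketch. It assumes that all sensor positions lie in a common zero-Doppler plane, so that the problem reduces to intersecting range circles in 2D (ground range, height); the full SAR sensor model used in the study is not reproduced here. Sensor positions, the target and all names are hypothetical.

```python
import numpy as np

def intersect_range_circles(sensors, ranges, start, n_iter=20, tol=1e-6):
    """Gauss-Newton intersection of range circles in a 2D plane.

    sensors: (N, 2) sensor positions (ground range, height) in meters.
    ranges:  (N,) measured slant ranges in meters.
    start:   initial guess of the point, e.g. from a coarse DSM.
    Returns the estimated point and the final range residuals.
    """
    sensors = np.asarray(sensors, dtype=float)
    ranges = np.asarray(ranges, dtype=float)
    p = np.asarray(start, dtype=float).copy()
    for _ in range(n_iter):
        diff = p - sensors
        dist = np.linalg.norm(diff, axis=1)
        residual = dist - ranges
        # Jacobian of the distances: unit vectors from sensors to the point.
        jac = diff / dist[:, None]
        step, *_ = np.linalg.lstsq(jac, -residual, rcond=None)
        p += step
        if np.linalg.norm(step) < tol:
            break
    residual = np.linalg.norm(p - sensors, axis=1) - ranges
    return p, residual

# Three hypothetical acquisitions at different look angles (assumed values).
sensors = np.array([[-300e3, 514e3], [-400e3, 514e3], [-520e3, 514e3]])
target = np.array([1200.0, 350.0])
ranges = np.linalg.norm(target - sensors, axis=1)

point, residuals = intersect_range_circles(sensors, ranges, start=(0.0, 0.0))
```

Because three circles over-determine the two unknowns, the residuals returned by the function could serve to flag inconsistent matches, in the spirit of the outlier handling described above.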
The presented approach yields an areal digital surface model. When subtracting a reference digital terrain model (DTM), e.g., available from airborne laser scanning, a canopy height model (CHM) can be extracted (cf. Figure 2 and Equation (1)). Such CHMs are an important source of information for forest assessment. As mentioned before, the canopy height underestimation can be quantified using laser scanner reference data. Such a comparison allows the underestimation factor τ to be determined in percent. In regions of forest the X-band CHM is then corrected by multiplication with the factor 1/(1 − τ/100), finally yielding the corrected DSM. The forest segmentation presented in the next section is then used to correct the canopy height bias (see Figure 1). It should be noted that this problem is not straightforward, as such an underlying image segmentation is often not available.
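The CHM extraction and the correction inside the forest mask can be written compactly. The following sketch uses NumPy arrays on a common raster; the function names and the toy values are assumptions for illustration only.

```python
import numpy as np

def canopy_height_model(dsm, dtm):
    """CHM as the difference between surface and terrain model."""
    return dsm - dtm

def correct_dsm(dsm, dtm, forest_mask, tau):
    """Scale the CHM by 1 / (1 - tau / 100) inside the forest mask only.

    tau is the canopy height underestimation in percent.
    """
    chm = canopy_height_model(dsm, dtm)
    factor = 1.0 / (1.0 - tau / 100.0)
    chm_corrected = np.where(forest_mask, chm * factor, chm)
    return dtm + chm_corrected

# Toy example: terrain at 300 m, forest in the two left columns.
dtm = np.full((4, 4), 300.0)
forest = np.zeros((4, 4), dtype=bool)
forest[:, :2] = True
# Radargrammetric DSM with a canopy reconstructed 25% too low (15 m vs 20 m).
dsm_sar = dtm + np.where(forest, 15.0, 0.5)

dsm_corr = correct_dsm(dsm_sar, dtm, forest, tau=25.0)
```

In this toy case the corrected canopy height inside the mask is 15 m / 0.75 = 20 m, while non-forest pixels are left unchanged.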

Forest Segmentation
It is important to note that by "forest segmentation" a binary classification into regions containing forest / non-forest is meant, not stand boundaries or individual tree segmentation. First results on this topic were published recently [31]. The authors perform the classification on TerraSAR-X backscatter mean and standard deviation statistics and assume that the land cover follows a multinomial (categorical) distribution and thus use logistic regression. The underlying coefficients are estimated based on a maximum likelihood method.
We extend their method by including backscatter intensity and texture information, a 3D canopy height model and interferometric coherence information. For classification a supervised approach is chosen by selecting multiple regions together with their ground truth class labels (forest / non-forest) and training a maximum likelihood (ML) classifier. The proposed ML method assumes Gaussian distributed data and therefore acts like a Mahalanobis distance classifier with prior probabilities. This classifier is then applied to the whole spatial extent of the given images. The resulting classification is constructed with a GSD of 5 meters. Next, very small areas of a size less than 20 pixels, i.e., an area of 500 m², are rejected using a standard region labeling approach [32].
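A Gaussian maximum likelihood classifier of this kind can be sketched in a few lines. The example below is an illustration and not the IMPACT implementation: the class name is hypothetical, the priors are taken from the training class frequencies (an assumption), and the four synthetic features merely stand in for backscatter, texture, canopy height and coherence.

```python
import numpy as np

class GaussianMLClassifier:
    """Per-class Gaussian model; decision by Mahalanobis distance,
    covariance log-determinant and log prior."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        self.params_ = []
        for c in self.classes_:
            Xc = X[y == c]
            mean = Xc.mean(axis=0)
            cov = np.atleast_2d(np.cov(Xc, rowvar=False))
            log_det = np.linalg.slogdet(cov)[1]
            log_prior = np.log(len(Xc) / len(X))  # assumed: empirical prior
            self.params_.append((mean, np.linalg.inv(cov), log_det, log_prior))
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        scores = []
        for mean, inv_cov, log_det, log_prior in self.params_:
            d = X - mean
            # Squared Mahalanobis distance of every sample to the class mean.
            maha = np.einsum("ij,jk,ik->i", d, inv_cov, d)
            scores.append(-0.5 * (maha + log_det) + log_prior)
        return self.classes_[np.argmax(scores, axis=0)]

# Synthetic training samples (all values are made up for illustration).
rng = np.random.default_rng(1)
forest = rng.normal([-8.0, 0.2, 20.0, 0.15], [1.5, 0.05, 4.0, 0.05], (300, 4))
other = rng.normal([-11.0, 0.5, 1.0, 0.55], [2.0, 0.15, 1.0, 0.2], (200, 4))
X = np.vstack([forest, other])
y = np.array([1] * 300 + [0] * 200)  # 1 = forest, 0 = non-forest

clf = GaussianMLClassifier().fit(X, y)
accuracy = (clf.predict(X) == y).mean()
```

Applied to an image stack, each pixel's feature vector would be passed to `predict`; the subsequent removal of regions smaller than 20 pixels is not part of this sketch.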
Texture Description. As observed in [31] regions of vegetation are less textured, i.e., more homogeneous, than regions of settlements or agricultural areas. The authors of [33] suggest describing this texture information by a variance filter. However, our tests showed that such a simple parameter does not work satisfactorily on TerraSAR-X or COSMO-SkyMed data. Therefore, we choose the Texture-transform [34], which is invariant to illumination, computationally simple and easy to parameterize, so that it also performs reasonably on high resolution radar data. This transform can be seen as a spatial frequency analysis, where the key idea is to investigate the singular values of matrices formed directly from gray values of local image patches (the backscatter information in our case). More specifically, the gray values of a square patch around a pixel are put into a matrix of the same size as the original patch. The texture descriptor is computed as the sum of some singular values of this matrix. The largest singular value encodes the average brightness of the patch and is thus not useful as a texture description. However, the smaller singular values encode high frequency variations characteristic of visual texture. Therefore, the singular values of this matrix are sorted in decreasing order. Then the Texture-transform at each pixel is defined as the sum of the smallest singular values. Based on a trial and error method several window sizes and singular value ranges were tested, where a window of size 33 × 33 and a range of 20 to 33 smallest singular values performed best.
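The texture descriptor can be sketched directly from this description. In the following illustration the range "20 to 33" is interpreted as the 20th to 33rd singular value in decreasing order (1-based), which is an assumption; the function name, the reflective border padding and the test images are likewise chosen only for the example.

```python
import numpy as np

def texture_transform(img, win=33, first=20, last=33):
    """Sum of the singular values first..last (1-based, decreasing order)
    of the win x win patch around each pixel."""
    half = win // 2
    padded = np.pad(np.asarray(img, dtype=float), half, mode="reflect")
    out = np.zeros(img.shape, dtype=float)
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            local = padded[r:r + win, c:c + win]
            # Singular values are returned sorted in decreasing order.
            s = np.linalg.svd(local, compute_uv=False)
            out[r, c] = s[first - 1:last].sum()
    return out

# A perfectly homogeneous patch versus a noisy one (synthetic data).
rng = np.random.default_rng(2)
flat = np.full((36, 36), 5.0)
rough = rng.random((36, 36))

tex_flat = texture_transform(flat)
tex_rough = texture_transform(rough)
```

A constant patch has rank one, so all singular values beyond the first vanish and the descriptor is close to zero, whereas the noisy image yields clearly positive values away from the padded border. The double loop is slow for full scenes and is kept only for readability.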
Canopy Height Model. Obviously, vegetation heights are a useful information to segment regions of forest. The canopy height model is extracted employing the methodology described in Section 2.1.
InSAR Coherence. For forest segmentation the interferometric coherence, which is a measure of the interferogram's quality, can be of great value since regions of vegetation suffer from temporal decorrelation (see also the detailed study on interferometric decorrelation [35]). The standard coherence estimation is based on a local complex cross-correlation and is known to over-estimate the real coherence value, especially in areas of low coherence (cf. [36,37]). In general, a larger window within cross-correlation provides a better, i.e., less biased, coherence estimate. Until recently the standard procedure was to estimate the coherence over the same window used for multi-looking. As the multi-looking sizes become smaller for TerraSAR-X and COSMO-SkyMed imagery, the coherence is highly over-estimated, resulting in a noisy coherence image. Therefore, a decoupling of the window size of multi-looking and cross-correlation is introduced. The resulting coherence estimate uses a correlation window of 10 × 10 pixels and a multi-looking window of 2 × 3 pixels to get square pixels (azimuth × range). The correlation window size is a trade-off between a rather unbiased coherence estimation (where larger windows perform better [36,37]) and a locally well defined unblurred estimate. The specific size of 10 × 10 pixels was empirically determined. Regions of very low coherence correspond mainly to vegetation (forests and agricultural areas). Thus, such coherence information is used in the classification process as one feature.
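The effect of the correlation window size on the coherence bias can be reproduced with a small numerical experiment. The sketch below implements the standard sample coherence estimator with a configurable window; the function names are hypothetical, the multi-looking step is omitted, and the two synthetic images are independent complex noise, i.e., their true coherence is zero.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_mean(x, win):
    """Moving average over a (rows, cols) window, valid region only."""
    return sliding_window_view(x, win).mean(axis=(-2, -1))

def coherence(s1, s2, win=(10, 10)):
    """Sample coherence magnitude of two co-registered complex images."""
    cross = box_mean(s1 * np.conj(s2), win)
    power1 = box_mean(np.abs(s1) ** 2, win)
    power2 = box_mean(np.abs(s2) ** 2, win)
    return np.abs(cross) / np.sqrt(power1 * power2)

# Two fully decorrelated synthetic SLC images (true coherence = 0).
rng = np.random.default_rng(3)
shape = (80, 80)
s1 = rng.normal(size=shape) + 1j * rng.normal(size=shape)
s2 = rng.normal(size=shape) + 1j * rng.normal(size=shape)

coh_small = coherence(s1, s2, win=(2, 3)).mean()    # multi-look sized window
coh_large = coherence(s1, s2, win=(10, 10)).mean()  # decoupled window
```

Although the true coherence is zero, the small window yields a markedly higher mean estimate than the 10 × 10 window, which illustrates why the two window sizes are decoupled.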

Test Sites and Data
Two test sites located in Austria were chosen to investigate the possible achievable accuracies of surface mapping and forest assessment. The test sites, i.e., "Burgau" and "Seiersberg", are presented in this section together with the available radar and reference data. An overview of the two test sites is shown in Figure 3. The test data consist of multiple TerraSAR-X and COSMO-SkyMed images.
• The TerraSAR-X imagery includes multi-look ground range detected (MGD) Spotlight and Stripmap products from ascending, respectively descending, orbit. All images were ordered as single-polarization products (HH) with science orbit accuracy. All Spotlight products are within the full performance look angle range of 20° to 55°, while two Stripmap images are outside the full performance range of 20° to 45° for Stripmap imagery [12]. It should be noted that the images acquired at steep look angles have a lower GSD than all other products. The TerraSAR-X InSAR pair consists of single look complex (SSC) data, ordered as dual-polarization products (HH,VV) with science orbit accuracy.
• The COSMO-SkyMed image triplet contains Spotlight 2 products in level 1B-Detected Ground Multi-look (DGM) mode from ascending orbit and right looking sensor in single HH polarization.
The image orientations are used as delivered by the TerraSAR-X and COSMO-SkyMed data providers and were not improved using manually selected ground control points (GCPs). Previous studies have shown that the initial geolocation accuracies are very precise. The CE90 values, i.e., the 90th percentile of the length of residual errors, are given in Table 1 (results taken from [14,15]). Our own studies conducted on TerraSAR-X data confirm those results [16,18].

Test Site Burgau
This rural test area covers agricultural as well as forest areas and shows flat to slightly hilly terrain, the ellipsoidal heights ranging from 270 to 445 meters above sea level (cf. Figure 3 (left)). TerraSAR-X imagery were acquired in the period of July and August 2009 and image triplets are gathered from ascending, respectively descending, orbit at different look angles. Table 2 sums up the image acquisition parameters of the "Burgau" test site. The TerraSAR-X Burgau data set is particularly of interest since the look angles from ascending and descending orbit are very similar, making a direct comparison possible. COSMO-SkyMed images were acquired in August 2010 from ascending orbit. The parameters are given in Table 3.  The InSAR coherence used in the forest segmentation process is derived from a TerraSAR-X single look complex (SSC) InSAR pair (see Table 4). Additional InSAR pairs are not available for this study, so that the evaluation is limited to the presented pair.

Test Site Seiersberg
This sub-urban area covers urban, agricultural as well as forest areas, the ellipsoidal heights ranging from 350 to 750 meters above sea level. Next to the river in the center there are flat afforested regions (cf. Figure 3 (right)) while in the borders there is hilly terrain mostly covered by forest. For this test site only imagery from the TerraSAR-X sensor was available and was acquired in the period of April to June 2009, in Spotlight and Stripmap mode. These image triplets have been acquired from ascending and from descending orbit at different look angles. Table 5 sums up the image acquisition parameters of the "Seiersberg" test site.

Reference Data
To enable quantitative evaluations LiDAR data is used as reference. It is important to note that airborne laser scanning underestimates the true vegetation height (cf. [38,39]). Nevertheless, LiDAR data is of significantly higher accuracy than the DSMs to be expected from radargrammetric processing of X-band SAR data, so that it can be seen as a useful reference. The LiDAR data employed in this study covers four measurements per square meter, which are processed to highly accurate DSMs and DTMs.
While the DSMs are automatically extracted, the DTMs are semi-automatically generated by classifying regions of vegetation, building, bridges and other man-made structures. The data sets were acquired in 2009. Therefore, the canopy height underestimation for the COSMO-SkyMed data is a bit larger than in our analysis, since the trees were growing within this year. Figure 4 shows a small subset covering 1,200 × 1,000 m², or 600 × 500 pixels respectively, to introduce the reference data visually. The LiDAR reference data is used twofold:
• First, to evaluate the DSMs derived using multi-image radargrammetry by a comparison to the reference LiDAR DSM. To differentiate the accuracies of regions on bare ground and in forests several regions of interest are selected defining two classes. The evaluation is then performed for these two classes and described by the average residual height error. Those manually selected regions are visualized in Figure 5 and have been selected in homogeneous terrain and tree height, so that the standard deviation of the height error indicates the 3D reconstruction inaccuracy rather than terrain properties. In Figure 5 also the number of areas of interest (AOIs), their mean spatial extension in square meters and the overall area is given.
• Second, to evaluate the forest segmentation quality a reference mask is derived using LiDAR data. The 1-m GSD LiDAR CHM is filtered with an order-statistic filter of size 7 × 7 and order 37, i.e., the 75th percentile. The CHM is then downsampled to a GSD of 5 m using a 5 × 5 average resampling. Next, pixels with a height larger than 8 meters are considered as forest regions and small regions are filled to eliminate noise.
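The derivation of the reference forest mask can be sketched with NumPy as follows. The function names, the edge padding and the toy CHM are assumptions made for the example, and the final filling of small regions is omitted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def order_statistic_filter(img, size=7, order=37):
    """Return the `order`-th smallest value (1-based) in each size x size
    neighborhood; order 37 of 49 is roughly the 75th percentile."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    flat = windows.reshape(*img.shape, size * size)
    return np.partition(flat, order - 1, axis=-1)[..., order - 1]

def block_average(img, block=5):
    """Downsample by averaging non-overlapping block x block cells."""
    rows = img.shape[0] // block * block
    cols = img.shape[1] // block * block
    trimmed = img[:rows, :cols]
    return trimmed.reshape(rows // block, block,
                           cols // block, block).mean(axis=(1, 3))

def reference_forest_mask(chm_1m, height_threshold=8.0):
    """Forest mask at 5 m GSD from a 1 m LiDAR CHM."""
    filtered = order_statistic_filter(chm_1m, size=7, order=37)
    chm_5m = block_average(filtered, block=5)
    return chm_5m > height_threshold

# Toy CHM at 1 m GSD: 30 m high canopy in the left half, bare ground right.
chm = np.zeros((20, 20))
chm[:, :10] = 30.0
mask = reference_forest_mask(chm)
```

The order-statistic filter slightly enlarges the canopy regions before averaging, so pixels at forest borders tend to be assigned to the forest class.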

Multi-Image DSM Generation
For visual interpretation some detailed results are given in Figure 6.
Shown are stereo-radargrammetric derived DSMs, CHMs and related height errors, together with LiDAR reference data and a topographic map for the test site "Burgau". The height error maps reveal that regions of forest are reconstructed too low (bluish colors), while non-forest regions correspond to the LiDAR height information (green colors). In addition so-called border or edge effects are visible, i.e., incorrect height estimates at 3D break lines, e.g., at forest boundaries. This effect is known and can be traced back to the SAR layover, foreshortening and shadow effects.
For the quantitative evaluation several DSMs are derived to enable a comparison and to show the benefit of the triplet-based approach.
• Three DSMs are extracted from pure stereo constellations. In particular, two images of a triplet form a stereo pair that is processed while ignoring the third image. These resulting DSMs are labeled 12, 23 and 13 (cf. e.g., Table 6, first column), meaning that image 1 was matched to image 2, and so on.
• Two additional DSMs are extracted using the complete triplet. Constellation 123 is obtained by combining match results of adjacent stereo pairs (i.e., 12 and 23), while 123-c includes the so-called cross-matching constellation 13.
Therefore, in total five DSMs are calculated for each triplet. To evaluate their accuracies 30 regions of bare ground and 73 forested areas were manually selected for test site "Burgau" (82 and 71 for "Seiersberg"), spatially equally distributed over the scenes (cf. Figure 5). These regions are compared to the reference LiDAR DSM and the quality is described by the average height error µ in meters and the average standard deviation height error σ in meters. For the regions of interest over forest also the average canopy height underestimation τ is extracted and given in percent.
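The evaluation measures can be sketched as follows. It is assumed here that µ, σ and τ are first computed per AOI and then averaged over all AOIs of a class, and that τ relates the mean radargrammetric canopy height to the mean LiDAR canopy height of an AOI; the function names and the toy rasters are hypothetical.

```python
import numpy as np

def height_error_statistics(dsm, ref_dsm, aoi_labels):
    """Average mean and average standard deviation of the height error.

    aoi_labels: integer raster, 0 = outside any AOI, k > 0 = AOI number k.
    """
    errors = dsm - ref_dsm
    ids = [i for i in np.unique(aoi_labels) if i != 0]
    means = np.array([errors[aoi_labels == i].mean() for i in ids])
    stds = np.array([errors[aoi_labels == i].std() for i in ids])
    return means.mean(), stds.mean()

def canopy_underestimation(chm, ref_chm, aoi_labels):
    """Average canopy height underestimation tau in percent."""
    ids = [i for i in np.unique(aoi_labels) if i != 0]
    taus = [100.0 * (1.0 - chm[aoi_labels == i].mean()
                     / ref_chm[aoi_labels == i].mean()) for i in ids]
    return float(np.mean(taus))

# Toy data: two forest AOIs with 20 m and 30 m reference canopy height.
labels = np.zeros((10, 10), dtype=int)
labels[0:4, 0:4] = 1
labels[5:9, 5:9] = 2
ref_chm = np.zeros((10, 10))
ref_chm[labels == 1] = 20.0
ref_chm[labels == 2] = 30.0
dtm = np.full((10, 10), 300.0)
ref_dsm = dtm + ref_chm
dsm = dtm + 0.75 * ref_chm  # canopy reconstructed 25% too low

mu, sigma = height_error_statistics(dsm, ref_dsm, labels)
tau = canopy_underestimation(dsm - dtm, ref_chm, labels)
```

For this toy case the height errors are −5 m and −7.5 m in the two AOIs, and the underestimation is 25% in both.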
Test Site Burgau. Tables 6 and 7 reveal that the DSM error is inversely related to the intersection angle (the starting look angle θ plus the intersection angle in degrees is given in these tables), i.e., small intersection angles result in large errors, as seen in the large standard deviation values of the stereo constellations 23. Therefore, the pure stereo constellation 13 yields the best results (cf. [18]). When analyzing the triplets, it can be seen that the triplet using cross-matching performs better. Overall, the best results on bare ground have a mean value below 20 cm for TerraSAR-X and 50 cm for COSMO-SkyMed with a standard deviation of 2 meters. However, when moving into regions of forest a systematic canopy height underestimation is visible (as predicted from previous studies on InSAR processing over forest [21][22][23][24][25][28]). Tables 6 and 7 reveal that the standard deviation of the height error drops a bit in regions of forest, while the mean height errors show a systematic bias. This aspect should be investigated in the future. Again the stereo configuration with the smallest intersection angle behaves differently from all others, which yield an underestimation in the range of 25 to 35% for TerraSAR-X. For COSMO-SkyMed the underestimation is in the range of 20%, where no direct correlation of the intersection angle and the canopy height can be observed. For triplets using cross-matching for ascending and descending orbits a detailed analysis of canopy underestimation and canopy height is given in Figures 7 and 8, which show the height of each forested area in decreasing order according to the mean LiDAR canopy height data of the AOIs. In addition the canopy height underestimation is given in percent. For TerraSAR-X data from ascending orbit the underestimation is between 25 and 30% and increases with canopy height (cf. Figure 7(a)), whereas in the TerraSAR-X descending case the underestimation decreases with canopy height.
Since this variation is within 5% of the tree height it may result from inaccurate image matching. This aspect should be investigated in future research.
The basic trend of underestimation is more or less similar for imagery from the different orbits (as expected). Beside this main trend the scattering of the individual measurements shows a lot of noise. For COSMO-SkyMed the underestimation is between 16 and 20% and is slightly decreasing with canopy height.
Test Site Seiersberg. The results on Spotlight imagery for this test site are similar to the previous test site. Table 8 again shows that the small intersection angles of the constellations 23 yield poor results with large standard deviation height errors. The best triplet configurations result in an average accuracy of 20 cm with standard deviations of about 2 meters on bare ground. In regions of forest again the canopy height is underestimated. The underestimation τ is in the range of 20% to 30%, with constellations 23 behaving differently. Detailed plots on canopy height and their underestimation are shown in Figure 9. The canopy height underestimation drops to 15% for small trees for the Spotlight dsc123-c constellation (Figure 9(b)). Again, it is assumed that such outliers come from inaccurate image matching. This aspect should be treated in future research. The accuracy of DSMs resulting from Stripmap images is in general lower, i.e., fewer details are visible, in comparison to Spotlight data (cf. [14,16]), with canopy height underestimation in the range of 25% to 40%. Since this study is based upon an evaluation on homogeneous areas of interest, this aspect is not reflected directly in the quantitative results given in Table 8. However, it can be seen that regions on bare ground are reconstructed with 1.5 m bias in height, while the standard deviation of areas on bare ground and on forest is quite similar to the ones resulting from Spotlight data. Further investigations are planned over tropical forest, where obviously Stripmap data are the preferred acquisition mode due to the larger swath width and the continuous along track coverage.
Figure 9. Seiersberg TerraSAR-X Spotlight asc123-c (a) and dsc123-c (b) and TerraSAR-X Stripmap asc123-c (c) and dsc123-c (d): Canopy height underestimation.
Overall. The derived canopy height shows significant underestimation in the range of 20% to 35% in our study, on average 26.6% ± 1.4% (TerraSAR-X) and 19.6% (COSMO-SkyMed) for the triplets using cross-matching. As this is a result of an intrinsic physical property of radar sensing in X-band, a similar height bias in surface models over forest is also to be expected from TanDEM-X [40] and COSMO-SkyMed in tandem mode [13]. The significant difference in underestimation between the two sensors can most likely be traced back to different acquisition conditions, as the images were acquired with a time lag of 13 months. For a direct comparison, image triplets of both sensors, TerraSAR-X and COSMO-SkyMed, should be acquired within a short, common time frame.
In the presented case the canopy height underestimation can be corrected by applying the factor 1/(1 − µ(τ)/100), where µ(τ) is the mean underestimation in percent. After this correction the maximal height error over forest is reduced to 0.5 m for the triplet cases. Nevertheless, as the amount of height underestimation over forest depends on a multitude of a priori unknown factors, deriving a highly accurate canopy height model from very high resolution SAR imagery is and remains very challenging. Additional experiments over different types of forest (deciduous and coniferous) will show whether a pre-segmented forest classification is sufficient to undo the underestimation bias on a large scale.
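The correction factor can be illustrated with a minimal sketch. It assumes that τ is defined as the underestimation in percent of the reference canopy height, which is consistent with the factor given above but is not spelled out in this section; the function name and the sample values are hypothetical.

```python
def correct_canopy_height(measured_height_m, mean_underestimation_pct):
    """Scale a SAR-derived canopy height by 1 / (1 - mu(tau) / 100).

    Assumption: tau = 100 * (h_ref - h_sar) / h_ref, i.e. the
    underestimation is expressed in percent of the reference height.
    """
    if not 0.0 <= mean_underestimation_pct < 100.0:
        raise ValueError("mean underestimation must lie in [0, 100) percent")
    factor = 1.0 / (1.0 - mean_underestimation_pct / 100.0)
    return measured_height_m * factor


# Hypothetical stand: reference height 25 m, observed with the mean
# TerraSAR-X underestimation of 26.6% reported in the text.
reference_height_m = 25.0
measured_height_m = reference_height_m * (1.0 - 26.6 / 100.0)
corrected_height_m = correct_canopy_height(measured_height_m, 26.6)
print(round(measured_height_m, 2), round(corrected_height_m, 2))
```

A single global factor only removes the mean bias; stands whose individual underestimation deviates from µ(τ) keep a residual error, which matches the remaining height error noted above.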

Forest Segmentation
The segmentation is evaluated at one test site as only one InSAR pair was available for this study. It is assumed that other InSAR pairs from TerraSAR-X or COSMO-SkyMed would yield similar results, as the coherence over forest is always low using these sensors in repeat pass mode. For training the maximum likelihood classifier the AOIs in Figure 5 are taken as input data. In the testing phase the forest segmentation is evaluated on the whole region "Burgau" at a GSD of 5 m, in contrast to the DSM evaluation which is based on AOIs. The features used for forest segmentation are shown in Figure 10.
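As an illustration of the classification step, the following sketch implements a per-pixel maximum likelihood classifier under the assumption of Gaussian class distributions with equal priors; the section does not state which class model was used. The three feature columns (coherence, canopy height, amplitude texture) follow the text, but all training values are synthetic and hypothetical, as are the function names.

```python
import numpy as np


def fit_gaussian_ml(features, labels):
    """Estimate mean vector and covariance matrix for every class."""
    params = {}
    for cls in np.unique(labels):
        samples = features[labels == cls]
        params[int(cls)] = (samples.mean(axis=0), np.cov(samples, rowvar=False))
    return params


def predict_gaussian_ml(params, features):
    """Assign each sample to the class with the highest Gaussian log-likelihood."""
    classes = sorted(params)
    scores = []
    for cls in classes:
        mean, cov = params[cls]
        diff = features - mean
        inv_cov = np.linalg.inv(cov)
        _, log_det = np.linalg.slogdet(cov)
        mahalanobis = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        # Constant terms are dropped because they are equal for all classes.
        scores.append(-0.5 * (mahalanobis + log_det))
    best = np.argmax(np.stack(scores, axis=1), axis=1)
    return np.array(classes)[best]


def synthetic_pixels(rng, n, forest):
    """Hypothetical feature samples: [coherence, canopy height (m), texture]."""
    if forest:
        return rng.normal([0.2, 20.0, 1.0], [0.05, 3.0, 0.2], size=(n, 3))
    return rng.normal([0.6, 1.0, 0.5], [0.10, 1.0, 0.2], size=(n, 3))


rng = np.random.default_rng(0)
train_x = np.vstack([synthetic_pixels(rng, 500, True), synthetic_pixels(rng, 500, False)])
train_y = np.array([1] * 500 + [0] * 500)  # 1 = forest, 0 = non-forest
test_x = np.vstack([synthetic_pixels(rng, 500, True), synthetic_pixels(rng, 500, False)])
test_y = np.array([1] * 500 + [0] * 500)

model = fit_gaussian_ml(train_x, train_y)
pred = predict_gaussian_ml(model, test_x)
accuracy = float(np.mean(pred == test_y))
print(f"overall accuracy on synthetic data: {accuracy:.3f}")
```

In practice the training samples would be drawn from the AOIs and the classifier applied to every pixel of the feature stack; the synthetic classes here are far better separated than real data.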
The most important inputs for the segmentation are the canopy height model and the InSAR coherence. This can be verified in the confusion matrices given in Table 9, which show the best result (coherence, canopy height model and texture of amplitude) together with the individual results obtained using only one of the proposed modalities.
The confusion matrix in Table 9 reveals that 90% of all pixels (at a GSD of 5 meters) are correctly classified with respect to the LiDAR reference. About 8% of non-forest regions are classified incorrectly as forest. This happens especially in small forest clearings, which are not seen due to the slant range SAR geometry. The 2% of pixels classified wrongly as non-forest are mainly small forest stands where image matching is unsuccessful and the heights in such regions are therefore interpolated.
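How such a confusion matrix and the overall accuracy are obtained from a classified mask and a reference mask can be sketched as follows. The function names and the ten-pixel toy masks are hypothetical and do not reproduce the values of Table 9.

```python
import numpy as np


def confusion_matrix_2class(reference, predicted):
    """Rows: reference (forest, non-forest); columns: predicted (forest, non-forest)."""
    ref = np.asarray(reference, dtype=bool).ravel()
    pred = np.asarray(predicted, dtype=bool).ravel()
    return np.array([
        [np.sum(ref & pred), np.sum(ref & ~pred)],
        [np.sum(~ref & pred), np.sum(~ref & ~pred)],
    ])


def overall_accuracy(cm):
    """Fraction of pixels on the diagonal of the confusion matrix."""
    return float(np.trace(cm) / cm.sum())


# Toy masks, 1 = forest, 0 = non-forest (hypothetical values).
reference_mask = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted_mask = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

cm = confusion_matrix_2class(reference_mask, predicted_mask)
print(cm)
print(overall_accuracy(cm))
```

For the toy masks, one forest pixel is missed and one non-forest pixel is labeled as forest, so eight of ten pixels lie on the diagonal.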
A direct comparison to the classification results in [31] is explicitly avoided since different test sites are used and the evaluation strategies are not comparable. However, the confusion matrices in Table 9 show that the novel image modalities, namely the canopy height model and InSAR coherence, improve the segmentation results. It is to be expected that such information would also increase the robustness and accuracy of the land cover extraction algorithm in [31].
Figure 11 shows some examples of extracted forest border lines. Overall, the quality of extracted forest border lines is higher for large dense forests than for small isolated stands (this was also observed in [31]), which are often not detected at all. Nevertheless, forest border lines are in general very well extracted, and their accuracy depends directly on the forest segmentation.
Figure 11. Detailed views on forest border line extraction for two subsets. On the left reference border lines are given and on the right the automatically extracted borders using TerraSAR-X alone.

Conclusions and Future Work
Very high resolution SAR imagery enables forest assessment. In particular, multiple TerraSAR-X or COSMO-SkyMed images covering the same ground area under different look angles can be used to derive accurate DSMs fully automatically by radargrammetric processing. If reference DTMs are available, the canopy height model can be extracted. The forest canopy height is an important parameter as it is strongly correlated with other forest parameters, such as forest biomass, timber volume or carbon stocks. Furthermore, it serves as an important source for classification of forest types and conditions, forest morphology, crown closure, vertical structure and stand height [4]. The presented study revealed that the height of the canopy is systematically underestimated as the SAR signal in X-band penetrates into the canopy. Therefore, a forest segmentation is proposed, yielding an accuracy of 90%. This segmentation result is subsequently applied to correct the canopy height bias in regions of forest. With this approach, the DSMs have an average height accuracy of 20 cm for TerraSAR-X and 50 cm for COSMO-SkyMed and a standard deviation of about 2 m on bare ground and over forest, evaluated on manually defined areas of interest.
However, the canopy underestimation depends on various aspects, including tree species, forest stand density, tree height and look angles. The forest test sites used in the presented study contain almost exclusively dense stands of deciduous trees. It is expected that the canopy underestimation will be larger for coniferous trees and for more open stands. In addition, several questions arose during the interpretation of the canopy height underestimation that should be tackled in future research. The main points are the observed variation of the underestimation with respect to tree height and the connection between underestimation and stereo intersection angle. It is unclear whether those effects stem from inaccurate image matching or from other aspects. The fact that the standard deviation of the height error is smaller in regions of forest than on bare ground could possibly also be related to image matching. Overall, a more detailed analysis is required, extending the presented AOI evaluation approach to a spatially dense verification framework as, e.g., presented in [31].
On the algorithmic level, future work should focus on improving the stereo matching techniques. The quasi-epipolar rectification used here should be replaced by a true epipolar rectification that incorporates the sensor models and thereby limits the search space of image matching to one dimension. Novel image matching algorithms employing (semi-)global optimization schemes could then be applied [30]. The method for land cover (specifically forest) classification should also be improved, both on the algorithmic and on the evaluation level.
On the semantic level, future work should focus on a comparison to InSAR-based DSMs from tandem data in bistatic mode (TanDEM-X and COSMO-SkyMed-Tandem), comprising multiple InSAR pairs with different look angles to better understand the penetration of X-band signals into the canopy. In addition, a time series over multiple seasons would be useful to monitor the influence of weather conditions and seasonal effects on the canopy height underestimation. Ideally, such time series should be acquired by both TerraSAR-X and COSMO-SkyMed to enable a fair comparison between the two sensors.