Remote Sensing Effects of Percent Tree Canopy Density and Dem Misregistration on Srtm/ned Vegetation Height Estimates

The U.S National Elevation Dataset and the NLCD 2001 landcover data were used to test the correlation between SRTM elevation values and the height of evergreen forest vegetation in the Klamath Mountains of California.Vegetation height estimates (SRTM-NED) are valid only for the two out of eight geographic directions, due to NED and SRTM grid data misregistration. Penetration depths of SRTM radar were found to linearly correlate to tree percent canopy density.


Introduction
Digital elevation models (DEMs) are digital representations of ground surface topography or terrain that provide quantitative elevation data for many applications in topographic and landcover studies, geomorphology, etc [1,2].
DEMs provide either bare earth (ground) -termed as digital terrain models (DTMs) -or elevations with respect to the reflective surface (which may be vegetation, human-made features, etc) -termed as digital surface models (DSMs).The U.S National Elevation Dataset (NED) gives bare earth elevations (DTM) for the conterminous U.S.A [3].The Shuttle Radar Topography Mission (SRTM) provides DSM elevation data for the majority of Earth's land surface [4].It has been shown [5] that SRTM OPEN ACCESS DSM, analyzed in conjunction with bald Earth elevation data from the NED, was highly correlated with vegetation height, yielding correlation coefficient values in the interval [0.76, 0.86] while the height differences computed from square elevation grids are aspect and slope terrain dependent [6].The aspect dependency of elevation differences among SRTM DSM and DTMs was found to be a function of the misregistration evident among the two height models [7].It is expected that the aspect/slope dependency of elevation differences will affect the vegetation height estimates from SRTM DSM and NED DTM data.On the other hand, the SRTM radar signal measurement results in a reflective surface elevation which depends on landcover and is a complicated function of the electromagnetic and structural properties of the scattering medium [8].The degree of penetration depends on vegetation type gap structure, canopy density, leaf-on versus leaf-off, wetness, etc [9,10].Thus it is assumed that percent tree canopy density [11] will also affect the SRTM vegetation height estimates.
The aim of this paper is to use landcover data [12] in order to assess the impact of the tree canopy density to the vegetation height estimates.Such an effort might improve the biophysical parameters estimates from SRTM data.

Study Area Data and Methodology
This section is devoted to describing the study area, the relevant DEM data and their derivative products (slope and aspect), the landcover data that were used, as well as the image of the noted elevation differences (NED minus SRTM).Statistics are computed for terrain classes in an attempt to model the impact of relief and tree canopy for the evergreen forest landcover class.The statistical distributions were modeled on the basis of mean, standard deviation, and the coefficient of skew and kurtosis [13].
Skewness characterizes the degree of asymmetry of a distribution around its mean [14].Positive skewness indicates a distribution with an asymmetric tail extending toward more positive values.Negative skewness indicates a distribution with an asymmetric tail extending toward more negative values.The coefficient of skew is a unitless number.Negative coefficient of skew indicates frequency distributions with asymmetric tails extending toward more negative values.According to an empirical rule [15], when the absolute value of the skew exceeds a value such as 0.5, then the error distribution is sufficiently asymmetrical to cause concern that the dataset may not represent a normal distribution.
Kurtosis characterizes the relative peaked ness or flatness of a distribution compared with the normal distribution.Positive kurtosis indicates a relatively peaked distribution (leptokurtic).Negative kurtosis indicates a relatively flat distribution (platykurtic curve), while a value of 0 represents a mesokurtic curve, more particularly, the bell-shaped curve of the normal distribution [14].

Data
The study area corresponds to a portion of the Klamath Mountains section that occupies parts of the northern California and southern Oregon between the Southern Cascade Mountains and the Coast Range mountains (Figure 1).It is bounded by latitude in the range 41.1815 o to 41.7380 o (N) and longitude in the range -124.0193o to -123.4545 o (W).The datasets (DTM, DSM and landcover data) were acquired from the USGS seamless data distribution system [16].Slope values (0 to 88 o ) were rescaled in the range 255 to 0. The lighter a pixel, the lower its slope.(c) Aspect image.Aspect values were quantified to the 8 directions (E, NE, N, NW, W, SW, S, SE) defined in a raster image while the zero label was used for flat terrain (if slope was less than 1 o , aspect was considered to be undefined).

Bare earth DEM, slope and aspect
The USGS National Elevation Dataset (NED) has been developed by merging the highestresolution, best quality elevation data available across the U.S into a seamless raster format.NED is essentially a digital terrain model (DTM) depicting bare earth (ground) elevations in a consistent projection (geographic coordinates), resolution (1 arc-second), and elevation units (meters).The horizontal datum is NAD83 and the vertical datum is NAVD88, except for Alaska, which are respectively NAD27 and NAVD29, with accuracy specification of root mean square error (RMSE) equal to 7 m [3].The NED DTM of the study area (Figure 2A) consists of point elevations in a grid of 2,003 rows by 2,033 columns.The orthometric heights (NAVD88) of NED DTM were converted to WGS84 ellipsoidal (geometric) heights using the GEOID03 geoid model as outlined in section 2.4 below.Subsequently, slope (Figure 2B) and aspect (Figure 2C) were computed [17,18].

SRTM finished DSM
The Space Shuttle Endeavour during an 11-day mission in February of 2000, known as "Shuttle Radar Topography Mission (SRTM)", obtained elevation data on a near-global scale (between 60 o N and 56 o S latitude) to produce the most complete, highest resolution DSM of the Earth [19].It produced digital topographic data on 3 arc-second grid which roughly translates to a 90-meter horizontal spatial resolution.The United States is covered by a more refined 1"x 1" grid (30-meter resolution).The elevation data are with respect to the reflective surface (first return), which may be vegetation, human-made features, etc.Several additional substantial editing steps (e.g.filling voids, modifying elevations over oceans, lakes, and lake islands), also referred to as "finishing" were applied [20].Its absolute horizontal and vertical accuracies is given as equal to 20 meters (circular error at 90% confidence) and 16 meters (linear error at 90% confidence) respectively [21].
The SRTM finished DSM of the study area is given in geographic co-ordinates (using as horizontal datum the WGS84 ellipsoid) with spacing of 1 arc second.The SRTM DSM grid values are provided in terms of orthometric heights with respect to the geoid model computed from the EGM96 geopotential model [21].Ellipsoidal heights with reference to the WGS84 ellipsoid were computed (Figure 3A) using the GEOID03 geoid model.Voids and points with slope less than 1 o (Figure 3B) were excluded from computations.Finally, the elevation difference (NED-SRTM) per grid point was computed (Figure 3C).These differences are affected by vegetation height and DEM errors.

Orthometric to geodetic height recalculation
In order to conduct a realistic and consistent comparison amongst the available height data sets, it was imperative that all heights refer to the same vertical datum.This is a rather complicated task considering that the available height data is not obtained in its original form and detailed accounts of the altered state of the height elevations are not readily available.Therefore, for the purposes of this study, it was decided to perform the data comparisons in terms of ellipsoidal heights with respect to WGS84, which for any further future analyses, is consistent with the geocentric reference system employed by GPS.In order to transform the NED (or SRTM) based orthometric heights H into the required corresponding ellipsoidal (geometric) heights h we used their known relationship with the geoid height N expressed by Equation 1: h-H=N (1) where in practice both sides of equation may be arrived at by separate means.That is, a gravimetric geoid model can give in a point of interest an estimate of the geoid height, whereas removing the orthometric height from an ellipsoidal height will yield another estimate.In the present case, in order to perform the necessary calculations for the ellipsoidal heights with respect to WGS84, we looked at first at two models of the geoid: (a) the geoid implied by the spherical harmonics expansion of the EGM96 geopotential model, and (b) through interpolation the GEOID03 model that is available in a grid form from the US National Geodetic Survey [22,23].
GEOID03 is a state-of-the-art refined hybrid model of the geoid in the conterminous United States (CONUS), combining the gravimetric geoid USGG2003 that uses the EGM96 global geopotential model (GGM) as an underlying long wavelength model [24], together with datum transformations and 14185 NAD83 GPS-derived ellipsoid heights on NAVD 88 leveled bench marks (the so called GPSBM2003 dataset).
It should be noted that the GEOID03 models geoid heights above the North American Datum of 1983 (NAD 83) which uses the GRS80 ellipsoid.However, latitudes and longitudes in the GRS-80 and WGS84 (G873) systems are very close to those of the NAD 83 system (with only 1-2 meters of horizontal shift).So any of these types of latitude and longitude (NAD 83 and WGS84) may be input, without affecting the interpolated geoid value.Since for the entire CONUS area the geoid is below the ellipsoid, the GEOID03 geoidal heights in the area of interest range from a height of about -25 meters in the northeastern corner portion and to a low of about -29 m in the southwestern corner portion of it (Figure 4A).
A straightforward comparison of the GEOID03 vs. the EGM96 geoid showed, as anticipated, that the former model provides a finer local detail (Figure 4B) of the geoid, with the differences of the two models ranging between 0.04 to -0.9 meters.Therefore, it was decided to use the geoid undulations corresponding to the GEOID03 model and to add them to the available DEMs (Equation 2) so that to derive the required ellipsoidal heights with respect to WGS84: h WGS84 =H DEM +N GEOID03 (2) It is important to note that evidently these geoid values are not identical with the values used in the original C-band processing of the SRTM data.On the other hand, the use of these values compensates somewhat for the 'discretization' error of up to one meter that is inherent in the available SRTM DSM values which are truncated integer-valued orthometric heights.

Landcover and percent canopy density
The US Geological Survey's National LandCover Database (NLCD) 2001 is a Landsat based landcover database containing 21 classes of landcover data [12] projected to Albers Conical Equal Area using the NAD83 Datum, GRS 1980 Spheroid, with a spatial resolution of 30 m.It improves upon its predecessor NLCD 1992 in two important ways.Whereas NLCD 1992 was simply a landcover data set, NLCD 2001 is a landcover database comprised of three elements: landcover, impervious surface and the sub-pixel percent tree canopy density.Secondly, NLCD 2001 used improved classification algorithms [25], which have resulted in data with more precise rendering of spatial boundaries between the standard 16 classes (with an additional nine classes being available in coastal areas and another four classes in Alaska only).
79.4% of the study area (Figure 5A) is covered by the evergreen forest class that correspond to areas dominated by trees generally greater than 5 meters tall, and greater than 20% of total vegetation cover (Figure 5A).More than 75% of the tree species maintain their leaves all year.Canopy is never without green foliage.According to the relevant metadata file [16] the overall accuracy (errors of Omission/Commission) is 88%/85.5% Kappa, while for the evergreen forest class the overall accuracy is 91% / 92%.
The sub-pixel percent tree canopy density within the NLCD 2001 database [11] for the study area is presented in Figure 5B.Tree canopy refers to a layer or multiple layers of branches and foliage at the top or crown of a forest's trees.The canopy density database element classifies each grid point into 101 possible values (0% -100%).The canopy density estimates apply only to the forest landcover classes.

Results and Discussion
For this study, the mask shown in Figure (3B) was applied to the landcover maps (Figures 5A, 5B) and so the points that corresponded either to voids or their slope were less than 1 o were excluded from our analysis.

Elevation differences modeling
The study area was divided into two terrain classes.The first one was formed by evergreen forest grid points with sub-pixel percent tree canopy density less than 66%.The excluded grid points formed the second class.Statistics (Table 1), frequency distributions (Figure 6) and rose diagrams (Figure 7) of the elevation differences per geographic direction were used.
It was revealed that elevation difference for grid points with slopes in opposite geographic directions was maximized, an exception being the N-S direction.For the majority of geographic directions, the absolute value of skew is less than 0.5 (Table 2), so the distributions (Figure 6) are sufficiently symmetrical for the normal distribution criterion to be fulfilled [15].Table 1 indicates that the majority of the frequency distributions are leptokurtic ones which, in turn, means that the higher frequencies are observed around the mean.
A two-sample means test (Equation 3) was applied [14], with the null hypotheses being that for a) E and W, b) SE and NW, c) SW and NE, directions the mean values were statistically the same: (3) X and Y correspond to the means of the two populations compared, Sx and Sy are the corresponding standard deviations, and nx and ny are the sample size   The differences in mean values (Table 1) were proven statistically significant since the observed tstatistics equal to a) 525.724, b) 615.74 and c) 467.37 that were far greater than the tabled critical value (2.326) of t (one-tailed test, for infinite degrees of freedom at the 0.01 level).So the null hypotheses were rejected.The data (Table 1, Figure 6) indicated an overestimation of elevation towards the NE, E and SE directions, whereas underestimation is evident along SW, S and NW directions.Figure 7 and Table 1, indicated that overestimation of elevation was greater for the terrain class formed by grid points with sub-pixel percent tree canopy density greater than 66%.Overestimation is acceptable since the SRTM radar signal measurement result in a reflective surface elevation which depends on terrain cover [7].

Elevation differences versus slope magnitude and percent tree canopy density
In order to reveal the possible slope and canopy dependency of elevation differences, both 1 o degree slope and 1% percent canopy intervals were defined.Mean elevation difference was then computed per slope/canopy interval per geographic direction (Figures 8, 9).Elevation differences were linearly correlated to the terrain slope, the steeper the slope, the greater the absolute value of the elevation difference (Figure 8).More specifically, an overestimation of elevation by the SRTM instrument was observed along the SE, E, and NE directions (negative elevation difference that decreases linearly with slope) while an underestimation was evident towards NW, W, SW directions (positive elevation difference increasing linearly with slope).The same results were obtained earlier for various world regions and terrains [17,26] and so it seems that there is rather a consistency in SRTM errors in relation to aspect.Thus, if vegetation height is assigned to elevation differences obtained from SRTM DSM and NED DTM, the effect of misregistration should be considered since estimations are valid only for the two geographic directions (N and S for the present case study).Elevation differences were linearly correlated to sub-pixel percent tree canopy density (Figure 9) per geographic direction (the greater the canopy density, the less the SRTM instrument penetration).
The relationship between the tree canopy (X) and the corresponding mean elevation difference (Y) per geographic direction was further explored by assuming the linear regression model (Figure 8).The test statistic for the significance of the regression model is distributed as the variance ratio (F) between the regression variance to residual variance [14].The null hypothesis under test is that of 'no linear relation of Y and X'.The computed variance ratio per geographic direction was given in Table 2. Since variance ratios were far greater than the F critical value that at the 0.01 significance level for (1, 89) degrees of freedom, the null hypotheses were rejected.Table 2. Linear regression of mean elevation difference (Y) per 1% width sub-pixel tree canopy density (X) per geographic direction.The linear regression coefficients (a and b), the slope of regression line expressed in degrees, the R square (R corresponds to the correlation coefficient) as well as the variance ratio were presented.The above findings clearly demonstrate that there are significant effects, such as different penetration depths of radar as a function of the tree canopy density which influence the biophysical estimates (elevation differences that were assumed to correspond to vegetation height).

Discussion
The NLCD 2001 landcover database was assumed to coincide with the landcover during the time of SRTM data acquisition.The SRTM mission took place during an 11-day period in February 2000; that is, during the winter period in the Northern Hemisphere.The study was concentrated only on the evergreen forest class (75 % of the tree species maintain their leaves all year).Hence, the influence of forest canopy seasonal variability (no leaf condition) on the elevation differences estimates should be minimal.
The aspect and slope dependency of elevation differences were computed on the basis of bare earth elevation (NED DTM) and not on basis of the reflective elevation surface (SRTM DSM).Elevation underestimation (Figure 8) is definitely geographic direction dependent [6,17] while the factors associated with are related to SRTM/NED misregistration [7].Slope magnitude (Figure 8) affects the elevation differences too [27].If vegetation height estimate is assigned to elevation difference (SRTM-NED) then valid estimates are only for two geographic directions (N and S for the current case study; Figure 8).
Both elevation data sources are not perfect and error is evident [3,15,19,21].NED DTM is of greatest accuracy (RMSE<7 m) than SRTM DSM (RMSE<10 m).Thus, the elevation difference (Figures 8 and 9) is a function of both vegetation height and the error in height estimates.On the other hand, mean elevation differences were computed per slope/canopy interval per geographic direction in Figures 8 and 9, and so averaging minimize the effect of the error (normal distribution of random error is assumed) and reveals the vegetation height effect.Figure 8 indicates that the mean elevation difference (vegetation height= SRTM DSM-NED DTM) corresponds to 15 m (the point in y axis that is crossed by the N and S lines).
The tree canopy density per geographic direction was correlated linearly to the SRTM/NED elevation difference.The absolute slope (Table 2) of the regression lines (for the 8 geographic directions) is within the interval [7.1 o , 12.7 o ] indicating that penetration depth of radar is decreasing linearly to the increase of the tree percent canopy density.
Figures 8 and 9 indicated that the most valid estimates of the relationship of the penetration depth of radar to the tree percent canopy density are obtained for the N and S directions.

Conclusions
The decomposition of elevation differences (NED DTM minus SRTM DSM) on the basis of aspect and slope terrain classes identified: a) overestimation of elevation by the SRTM instrument along the SE, E, and NE directions (negative elevation difference that decreases linearly with slope) while, b) underestimation was evident in the NW, W, SW directions (positive elevation difference increasing linearly with slope).These findings revealed the misregistration of NED DTM to SRTM DSM in the study area.
Elevation differences were linearly correlated to sub-pixel percent tree canopy density.The aspect/slope and percent tree canopy dependency of elevation differences (NED-SRTM) indicates that vegetation height estimates are mostly valid along two geographic directions due to the misregistration of DTM and DSM.The penetration depth of SRTM radar is tree canopy dependent and thus tree percent canopy density should be derived from Landsat ETM imagery in order to take it into account in vegetation height modeling studies from DTM and DSM data.

Figure 1 .
Figure 1.Location of the study area within the USA.

Figure 2 .
Figure 2. (a) NED DTM image.Elevation (geodetic height) values ( -24.5 to 1829 m) were rescaled in the range 255 to 0. The lighter a pixel, the lower its elevation.(b) Slope image.Slope values (0 to 88 o ) were rescaled in the range 255 to 0. The lighter a pixel, the lower its slope.(c) Aspect image.Aspect values were quantified to the 8 directions (E, NE, N, NW, W, SW, S, SE) defined in a raster image while the zero label was used for flat terrain (if slope was less than 1 o , aspect was considered to be undefined).

Figure 3 .
Figure 3. (a) SRTM DSM.Ellipsoidal (geodetic height) values (-28.4 to 1827.6 m) were rescaled in the range 255 to 0. The lighter a pixel, the greater its elevation.(b) The dark points corresponds either to voids (SRTM DSM) or to NED DTM points with slope less than 1 o .(c) Elevation difference image (NED DTM minus SRTM DSM).Values were in the range -128.16 to 173.07 m.The lighter a pixel, the greater its elevation difference.

Figure 4 .
Figure 4. Geoid-related maps in the area of interest.(a) Contours of geoid heights (in metres) as computed from the US NGS GEOID03 model.(b) Differences between the GEOID03 and EGM96 geoids (in meters)

Figure 5 .
Figure 5. (a).The white points correspond to the evergreen forest class of the study area.(b) The sub-pixel percent tree canopy density of the study area (the lighter a pixel, the greater its canopy density).

Figure 6 .
Figure 6.The frequency distributions per geographic direction.The y-axis represents the umber of grid points per 1 m elevation difference interval.

Figure 7 .
Figure 7. Rose diagrams for the mean elevation difference per geographic direction, for the study area (total area) as well as the 2 terrain subclasses formed by grid points with canopy density in the range [0,65] and [66,100].The elevation difference assigned to the radius of each rose-diagram was within the range [-30, -1] m.

Figure 8 .
Figure 8.The mean elevation difference versus 1 o slope intervals per geographic direction.

Figure 9 .
Figure 9.The mean elevation difference versus 1% interval of sub-pixel percent tree canopy density per geographic direction.

Table 1 .
Elevation difference statistics per geographic direction for the study area as well as for the 2 terrain subclasses formed by grid points with canopy density in [0,65] and [66,100] respectively.