In the following, the results of the GEDI-DTM and -CHM accuracy analysis from the previous section are discussed and put in the context of existing literature. However, a number of factors that may limit this discussion need to be addressed beforehand: at the time of writing (October 2020), no similar studies analyzing the accuracy of GEDI height metrics are available. Therefore, studies that analyzed the accuracy of height metrics from similar spaceborne lidar missions (especially ICESat) have to be used. In addition to using data from a different mission, these studies mostly use mean and standard deviation to describe the height differences between the analyzed heights and reference heights. These indices are susceptible to outliers and therefore deliver values that may differ from the values of the parameters used here (median and MAD) if the analyzed distribution is not Gaussian. Unless the studies used for this discussion specifically mention the nature of the height difference distribution (which is rarely the case), their error indices may not be fully comparable with those from this study. It also has to be taken into account that the reference data in this study are themselves based on lidar height estimates. Their specified accuracy is much higher than that of the GEDI lidar, and they were acquired with a different method (discrete return vs. full waveform). Still, these height estimates might also contain certain errors (e.g., a general underestimation of forest heights) that are characteristic of lidar height estimates.
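The limited comparability of mean/standard deviation and median/MAD can be illustrated with a small numerical sketch. The height differences below are hypothetical values chosen for illustration, not data from this study, and the MAD is computed here as the unscaled median absolute deviation around the median, which is an assumption about the exact definition.

```python
import statistics

def mad(values):
    """Unscaled median absolute deviation around the median (assumed definition)."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])

# Hypothetical height differences in metres, with a single large outlier.
diffs = [-0.5, -0.2, 0.0, 0.1, 0.3, 0.4, 25.0]

mean_diff = statistics.mean(diffs)      # pulled upwards by the outlier
std_diff = statistics.stdev(diffs)      # inflated by the outlier
median_diff = statistics.median(diffs)  # stays close to the bulk of the data
mad_diff = mad(diffs)                   # stays close to the typical spread

print(f"mean = {mean_diff:.2f} m, std = {std_diff:.2f} m")
print(f"median = {median_diff:.2f} m, MAD = {mad_diff:.2f} m")
```

In this toy case a single outlier shifts the mean by several metres, while the median and MAD remain close to the values of the remaining differences.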
4.1. GEDI-DTM
The general statistical analysis of the DTM-differences reveals a relatively low median (and therefore a relatively high accuracy of the GEDI terrain height estimates), but also a high MAD (and therefore a relatively low precision of the GEDI terrain height estimates). The high accuracy is further indicated by the high R² values, while a number of high outliers, especially in the Thuringian forest area, mirror the relatively low precision. These outliers are dispersed over the study areas and not spatially clustered, though some of them are part of the same orbit-tracks and are therefore spread over the study areas along these tracks.
The median DTM-differences from this study (Roda: −0.26 m, Thuringian forest: 0.18 m) are in the same value range as corresponding mean values from similar studies. Chen [
3] found a mean difference of −0.97 m for a study area in North Carolina (using ICESat data), while Wang et al. [
32] found a mean difference of −0.61 m for Alaska (using ICESat-2 data). The standard deviation values in both studies were similar to the MAD values found here; however, the value range of the DTM-differences was much smaller. In both studies, diverse study areas with various types of land cover and topographies were used. Hilbert and Schmullius [
4] analyzed ICESat height estimates in forested and rugged terrain. Their results are especially relevant for this discussion, since their study area is also situated in the Thuringian forest, partially overlapping with the Thuringian forest study area from this study. Using height metrics derived from the ICESat-GLAS GLA01 product, Hilbert and Schmullius calculated a mean difference of 0.19 m compared to a reference ALS-DTM, which is remarkably close to the median of 0.18 m found in this study. Standard deviation of the differences was 5.01 m. In all studies mentioned, the R² for the correlation between analyzed and reference heights was between 0.98 and 1.
The analysis of the DTM accuracy based on height estimates derived with different algorithm setting groups shows both similarities and differences between the two study areas. The variation of median DTM-differences is higher in the Thuringian forest area. The height metrics from algorithm setting group 1, which are included in the L2B-datasets of the data used here, are indeed the most accurate for the Roda area, but not for the Thuringian forest area. There, algorithm setting group 3 delivers the most accurate height estimates. In both areas, algorithm setting group 5 delivers a significantly lower accuracy than all other groups. This is caused by its lower waveform signal end threshold compared to the other five setting groups. A low threshold can lead to the detection of noise below the actual ground return signal, resulting in an underestimation of the ground elevation [
27]. Algorithms with higher signal end thresholds (e.g., algorithm 1 and 3) are more accurate in this study, suggesting that these algorithms may be preferable for estimating ground elevation in temperate forests. The identical accuracy of algorithm setting groups 1 and 4 is due to the fact that these algorithms differ only in the signal start threshold, which has no effect on ground elevation estimates. It has to be noted that these accuracies are averages over all forest conditions present in these study areas. For acquisitions over more homogeneous forests (e.g., with consistently high canopy cover), different algorithms might be preferable.
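The effect of the signal end threshold can be sketched with a toy example. The following Python snippet is not the GEDI processing algorithm: the waveform amplitudes, the noise statistics, the threshold multipliers `k`, and the function `signal_end_bin` are all hypothetical and only illustrate how a lower threshold can place the signal end on a noise spike below the ground return.

```python
def signal_end_bin(waveform, noise_mean, noise_sigma, k):
    """Index of the last bin whose amplitude exceeds noise_mean + k * noise_sigma.

    Bins are ordered from the top of the recording (early) to the bottom (late),
    so a larger index corresponds to a lower elevation. Returns None if no bin
    exceeds the threshold.
    """
    threshold = noise_mean + k * noise_sigma
    above = [i for i, amplitude in enumerate(waveform) if amplitude > threshold]
    return above[-1] if above else None

# Hypothetical waveform: canopy return (bins 4-7), ground return (bins 10-13),
# and a small noise spike at bin 16, below the ground return.
waveform = [0, 1, 0, 2, 9, 14, 8, 3, 1, 0, 20, 35, 18, 4, 1, 0, 3, 0, 1, 0]
noise_mean, noise_sigma = 0.5, 1.0  # assumed noise statistics

high_threshold_end = signal_end_bin(waveform, noise_mean, noise_sigma, k=6)
low_threshold_end = signal_end_bin(waveform, noise_mean, noise_sigma, k=2)

print(high_threshold_end)  # signal end within the ground return
print(low_threshold_end)   # signal end on the noise spike below the ground return
```

With the higher multiplier the signal end stays within the ground return, whereas the lower multiplier moves it to a later bin, which corresponds to a lower (underestimated) ground elevation.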
The analysis of the relation between GEDI-DTM accuracy and various environmental and acquisition parameters yields somewhat surprising results. An example is the low influence of land cover within the footprint. Several studies (e.g., Hodgson et al. [
37]) observed a lower accuracy of lidar-based DTM over areas with dense/high vegetation. In such areas, the assumption that the lowest mode of a received lidar waveform represents the ground is problematic, since this signal can also stem from low vegetation [
38]. However, the results from this study show a slightly higher accuracy of the DTM in mixed forests than in agriculture or grassland areas. The high accuracy of the DTM in urban areas is equally surprising, since those areas are usually challenging when generating lidar-based DTM [
39].
Other results of the analysis are more in accordance with literature. The increased scattering of DTM-differences with increasing slope has been observed in numerous studies [
2,
32,
40]. This effect can have several causes: on sloped and forested terrain, lidar returns from both vegetation and ground can occur at the same height and therefore might not be distinguishable in the return waveform, especially since the ground returns are spread out over different heights [
41]. Also, geolocation errors of footprints can lead to height errors if the footprint area is sloped in the same direction that the footprint is misplaced, since the elevation where the footprint was (falsely) located can be different from the elevation at the actual position of the footprint [
42]. This possibility needs to be considered especially for this analysis, given the currently low geolocation accuracy of GEDI data. For example, in an area with a slope of 8.31° (which is the mean value of the median slopes within the footprints of the Roda study area), two points along the slope gradient with a horizontal distance of 10 m (the lower end of the specified geolocation accuracy of GEDI) have a ground elevation difference of 1.46 m, while a horizontal distance of 20 m (the upper end of the specified geolocation accuracy of GEDI) results in a ground elevation difference of 2.92 m. These numbers may give a rough estimate of the scale at which geolocation errors can influence GEDI DTM accuracy in sloped terrain.
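The elevation differences quoted above follow from simple trigonometry, assuming a uniform slope and a horizontal displacement exactly along the slope gradient. A minimal Python sketch (the function name `elevation_offset_m` is chosen here for illustration):

```python
import math

def elevation_offset_m(slope_deg, horizontal_shift_m):
    """Elevation difference between two points on a uniform slope that are
    separated by horizontal_shift_m along the slope gradient."""
    return horizontal_shift_m * math.tan(math.radians(slope_deg))

slope = 8.31  # mean of the median footprint slopes in the Roda area, in degrees

offset_10m = elevation_offset_m(slope, 10.0)
offset_20m = elevation_offset_m(slope, 20.0)

print(f"{offset_10m:.2f} m")  # 1.46 m
print(f"{offset_20m:.2f} m")  # 2.92 m
```

A displacement that is not aligned with the slope gradient would produce a smaller elevation difference, so these values are best read as upper bounds for the given slope and displacement.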
The slight decrease of accuracy with increasing canopy cover within the footprint has also been observed in other studies [
32]. This is due to the fact that the number of laser impulses that can penetrate the canopy and reach the ground decreases with increasing canopy cover. Stereńczak et al. [
43] did not observe an influence of canopy height when analyzing lidar DTM accuracy in a temperate coniferous forest, which is in line with the results found here.
The higher accuracy of height estimates acquired during nighttime can be explained by the lower background noise level (which is increased by sunlight during daytime). This improves the lidar’s capability to detect the ground through dense vegetation [
44]. The fact that the beam type of the footprint has almost no influence on DTM accuracy is most likely a result of the overall relatively low density of the forests in the study area. The ability of GEDI’s power beam to penetrate a canopy cover of up to 98% (in comparison to the 95% penetrable by the coverage beam) brings no advantage in this case, since only 20 of the footprints used have a canopy cover of over 95%. The value range of canopy cover values in the Roda area is most likely also the reason for the lack of influence of beam sensitivity on DTM accuracy in this study. Only 16 of the footprints used in the Roda study area have a canopy cover that is higher than their beam sensitivity, meaning that the probability of the laser beam detecting the ground is below 90% for these footprints.
The significantly lower accuracy of height estimates from a number of orbit-tracks in both the Thuringian forest and the Roda study area cannot be fully explained. As mentioned, it can be assumed that this result is not caused by one of the parameters whose influence is analyzed here (e.g., an increased slope). The only attribute that is consistent for all footprints of each orbit-track is the acquisition date. Therefore, it could be suspected that certain atmospheric conditions on these dates are the reason for the low accuracy. One parameter of this kind that can influence lidar height estimates is rainfall [
45]. While precipitation was indeed recorded in the Thuringian forest area on all three acquisition dates of the orbit-tracks listed in
Table 8, no precipitation was recorded in the Roda area on the two acquisition dates of the respective orbit-tracks listed in
Table 4 [
46].
4.2. GEDI-CHM
The differences between the two study areas regarding the accuracy of the GEDI-CHM are much larger than for the DTM. While the MAD of the CHM-differences is similar in both areas (with values close to 3 m), there is a significant difference between the medians: 2.11 m in the Roda area versus −0.23 m in the Thuringian forest area. However, these results are still somewhat in accordance with literature, since the results from similar studies also show rather large differences: while Nie et al. [
47] and Popescu et al. [
10] observed negative mean differences, Enßle et al. [
3], and Hilbert and Schmullius [
4] observed positive mean differences. The absolute values of these differences range from 0.2 to 2.5 m, about the same range as the median values observed in this study. It has to be noted that Hilbert and Schmullius also observed a significantly lower accuracy of the CHM than of the DTM derived from the same dataset. Two possible reasons for this generally lower accuracy of CHM compared to DTM are the dependency of CHM accuracy on DTM accuracy and the much more heterogeneous surface of the canopy, which requires higher geolocation accuracy of the data.
The differences in the results from the two study areas are also present in the analysis of the algorithm setting groups. While CHM accuracy is still generally higher in the Thuringian forest, this is not the case for every single algorithm. Most notably, algorithm 4 is the second least accurate for the Thuringian forest, but the most accurate for the Roda area. This may be due to its high waveform signal start threshold: the mostly sloped ground of the Thuringian forest means that the return signals from the canopy top are spread out over the return waveform and therefore might not be high enough to surpass the high signal start threshold. This leads to an underestimation of canopy height. It seems that this pattern of high canopy tops not being able to trigger the waveform start in this algorithm is also present in the Roda area. However, in this case it leads to a relatively high accuracy of the algorithm, because this ‘underestimation’ reduces the positive bias of GEDI canopy height estimates that the other algorithms display. Still, the generally higher CHM accuracy in the Thuringian forest area is somewhat surprising, since increased slopes are generally known to lead to an overestimation of canopy height (see below). Of course, the accuracy of canopy height estimates derived from waveforms also depends on the accuracy of the ground estimation. For example, the low CHM accuracy of algorithm 5 in both study areas is most likely caused by its low DTM accuracy: here, the underestimation of ground elevation results in a subsequent overestimation of canopy height. Again, the accuracy metrics are averages over all forests in the study areas and might be different for specific, more homogeneous forest areas.
Certain characteristics of the ALS reference data might have also influenced the results. Small-footprint, discrete return lidar systems tend to underestimate tree heights, especially of coniferous trees, because their laser impulses often miss the top of the tree crowns [
38]. The time span between the acquisition of the ALS height estimates and the GEDI height estimates also has to be taken into account when analyzing the GEDI-CHM accuracy. This time difference is consistent for the Roda area (5 years) but varies in the Thuringian forest area (between several months and 4 years). The mean annual height growth rate of the main tree types in the Roda area (pine and spruce) varies between 20 and 60 cm [
48,
49]. A mean annual height growth of 40 cm would amount to 2 m in 5 years. Therefore, it seems at least possible that the median CHM-difference of around 2 m can be explained to a certain degree by the height increase of the canopy between the acquisition dates. Another result that might suggest this is the apparent influence of vegetation height on CHM accuracy. In general, the annual height growth of trees slows down after the trees reach a certain age (depending on tree type and environmental factors) [
50]. This might explain why the median CHM-difference in this study is much higher for shorter (presumably younger) trees than for taller (presumably older) trees (
Table 11), assuming they experienced a larger height increase between the acquisition dates. However, this difference in accuracy appears to be too large to be solely explainable by height growth differences, and therefore may not be adequate to accurately quantify the influence of the acquisition time difference on GEDI-CHM accuracy. Also, there is no significant influence of acquisition time differences in the Thuringian forest area. Here, CHM accuracy does not significantly decrease with increasing time difference between the GEDI and ALS acquisitions (
Table 12). It also has to be noted that while the relation between CHM accuracy and vegetation height can be used to at least roughly assess whether vegetation growth between the acquisition dates had an effect on CHM accuracy, this does not give indications as to whether potential changes in land cover between the acquisitions might have influenced the results. There is, however, the possibility that the logging of trees in the Roda study area in June 2019 is the reason for at least some of the observed negative outliers.
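The growth estimate used in this paragraph for the Roda area can be reproduced with a short calculation. The sketch below assumes a constant annual growth rate over the whole period, which is a simplification, and the function name `expected_height_gain_m` is chosen here for illustration.

```python
def expected_height_gain_m(growth_cm_per_year, years):
    """Canopy height gain in metres for a constant annual growth rate."""
    return growth_cm_per_year * years / 100

years_between_acquisitions = 5  # ALS to GEDI time difference in the Roda area

gain_low = expected_height_gain_m(20, years_between_acquisitions)   # lower growth rate
gain_mid = expected_height_gain_m(40, years_between_acquisitions)   # mean growth rate
gain_high = expected_height_gain_m(60, years_between_acquisitions)  # upper growth rate

print(gain_low, gain_mid, gain_high)  # 1.0 2.0 3.0
```

Under this assumption, the reported growth rates of 20 to 60 cm per year correspond to a height gain of 1 to 3 m over 5 years, a range that includes the median CHM-difference of around 2 m observed in the Roda area.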
The analysis of GEDI-CHM accuracy with regard to environmental and acquisition parameters shows, in some cases, discrepancies with the results from other studies. As an example, many studies [
51,
52] observe a higher accuracy of lidar CHM in coniferous forests than in deciduous or mixed forests. This is because deciduous forests (during leaf-on conditions) tend to have a more closed structure with fewer gaps for the laser impulse to detect the ground through [
4]. This could at least explain the higher MAD over mixed forests observed in this study.
The increased MAD of the CHM-differences in areas with steeper slopes in the Thuringian forest study area is more in accordance with literature. Hilbert and Schmullius [
4] observed a similar trend in an adjacent study area, using ICESat-GLAS canopy height estimates. Aside from the reasons for this effect that are stated in
Section 4.1 (regarding the influence of slope on DTM accuracy), increasing slopes also lead to an increased lidar waveform extent [
53]. This can lead to an overestimation of canopy height if the canopy height is derived from the height differences between the waveform signal start and the ground elevation.
The significantly lower CHM accuracy in areas with a canopy cover lower than 25% may be due to several reasons: in sloped areas with low canopy cover, the return signals from the canopy top might not exceed the waveform recording start threshold, so the waveform recording is triggered by lower canopy layers or ground return signals. In this case, the height difference between the waveform signal start and the ground elevation is lower or (close to) zero. In addition, if the vegetation is concentrated in the lower part of a sloped area, the waveform recording might also be triggered by ground surface from the upper part of the area, which has a higher elevation than the canopy top [
3]. The relatively low horizontal geolocation accuracy of GEDI footprints can also reduce CHM accuracy in areas with low and heterogeneous canopy cover, if a footprint that covered vegetation is falsely located in an area with bare land, or vice versa. As
Table 13 shows, accuracy in areas with low canopy cover (<25%) improves when only footprints in flat terrain (median slope <3°) are considered, suggesting that slope is the dominant factor influencing CHM accuracy in areas with low canopy cover. The difference in accuracy between areas with low and high canopy cover that is still present in flat terrain (
Table 13) is most likely caused by geolocation errors, therefore giving a rough estimate of the influence of these errors on GEDI CHM accuracy.
The influence of acquisition time and beam type on CHM accuracy is generally similar to the results from the DTM, with the exception of the lower median of CHM differences from coverage beams compared to power beams. This may be connected to the increased median beam sensitivity of power beams (
Figure 13). For the dataset used here, the median CHM-difference increases with increasing beam sensitivity.
The connection between beam sensitivity and beam type does not explain the increase of median CHM-differences with increasing beam sensitivity. As mentioned in
Section 4.1, the number of footprints where the beam sensitivity could have an influence on CHM accuracy is small, and, if anything, a positive influence of increasing beam sensitivity on CHM accuracy would be expected. This may lead to the conclusion that beam sensitivity correlates with another parameter that has an influence on the accuracy. However, no parameter included in this analysis shows a noteworthy correlation with beam sensitivity (the highest being the correlation with slope, with a Pearson’s r of −0.11).
As with the DTM, the low accuracy of height estimates from certain footprints in both the Roda and Thuringian forest area cannot be fully explained, with the influence of rainfall during the acquisition only being a possible explanation for the Thuringian forest footprints. In addition, values that form clusters of outliers in the scatterplots (
Figure 7 and
Figure 11) are dispersed over the study areas and not spatially clustered.