1. Introduction
A detailed understanding of the magnitude and sources of error in LiDAR elevation data and their derived products (i.e., digital elevation models [DEMs] and canopy height models [CHMs]) is necessary for the operational use of LiDAR in deriving accurate forest inventory metrics [
1,
2,
3,
4]. Errors in the DEMs propagate to vegetation metrics, hindering multitemporal comparison [
2,
3,
5]. However, the accuracy of LiDAR-derived DEMs has often been overlooked, although each stage of the modelling process can introduce error into the DEMs [
6,
7].
The general principle of assessing the vertical accuracy of an elevation dataset is to compare its elevations with reference data, so that statistical parameters, such as the root mean square error (RMSE), can be calculated [
8]. Usually, the quantitative assessment of LiDAR elevation data is conducted by comparing “true” terrain checkpoints with LiDAR ground elevations using DEMs [
2,
3]. The two DEMs are paired, and an elevation difference is calculated for each pairing (i.e., DEMs of difference) [
4,
8,
9]. The main drawback of this approach is that field surveying is very time consuming and, in some situations, such as in densely forested areas, it is impossible to collect elevation data [
10]. When true ground points are not available, interpolated checkpoints from high-density LiDAR ground data have been used as a benchmark DEM to assess the quality of other, less accurate DEMs, to correct them, or both; examples include photogrammetric DEMs [
9,
11] or satellite-derived ones (e.g., SRTM) [
12,
13], using DEMs of difference as the procedure. This method has proven robust for estimating vertical errors, but it has only rarely been used to correct raw elevation values [
12,
13]. In the latter cases, elevation differences were corrected by regression analysis using several checkpoints, but the corrected DEMs remained affected by the random nature of the elevation errors. Using DEMs of difference as a local pseudo-geoid (i.e., all interpolated points with elevation deviations) allows less accurate elevation data (e.g., low-density LiDAR) to be adjusted by means of continuous elevation surfaces instead of checkpoints alone. This procedure reduces random and other methodological elevation errors, increasing the comparability between DEMs and derived vegetation products.
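The surface-based adjustment described above can be sketched as follows. This is a minimal illustration in Python, not the processing chain used in the study: both DEMs are assumed to be co-registered grids with the same origin and cell size, `None` marks cells without data, and the function names (`dem_of_difference`, `correct_points`) and the nearest-cell lookup are hypothetical choices made for the example.

```python
def dem_of_difference(low_dem, benchmark_dem):
    """Cell-wise elevation difference: low-density DEM minus benchmark DEM.

    Both inputs are lists of rows on the same grid; None marks no-data cells.
    """
    return [
        [None if lo is None or hi is None else lo - hi
         for lo, hi in zip(row_lo, row_hi)]
        for row_lo, row_hi in zip(low_dem, benchmark_dem)
    ]


def correct_points(points, dod, x_min, y_max, cell_size):
    """Subtract the local elevation difference from each raw (x, y, z) point.

    The difference surface is sampled with a nearest-cell lookup (an
    assumption of this sketch); points outside the grid or on a no-data
    cell are returned unchanged.
    """
    n_rows, n_cols = len(dod), len(dod[0])
    corrected = []
    for x, y, z in points:
        col = int((x - x_min) // cell_size)
        row = int((y_max - y) // cell_size)
        inside = 0 <= row < n_rows and 0 <= col < n_cols
        offset = dod[row][col] if inside else None
        corrected.append((x, y, z if offset is None else z - offset))
    return corrected


# Toy 2 x 2 grids with 1 m cells; the upper-left corner is at (0, 2).
low = [[101.0, 102.5], [100.2, None]]
benchmark = [[100.5, 102.0], [100.4, 103.0]]
dod = dem_of_difference(low, benchmark)

# Hypothetical raw low-density points (x, y, z in metres).
raw_points = [(0.4, 1.6, 101.2), (1.5, 1.5, 102.4), (1.5, 0.5, 103.9)]
adjusted = correct_points(raw_points, dod, x_min=0.0, y_max=2.0, cell_size=1.0)
```

Because every cell of the difference surface acts as a checkpoint, each point receives its own local offset rather than a single global or regression-based one; a bilinear lookup could replace the nearest-cell lookup where a smoother correction is wanted.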
However, distinguishing between real changes and instrumental or methodological noise requires appropriate error analysis to ensure that DEMs of difference are reliable [
14]. Moreover, LiDAR errors have been shown not to be constant within a flight line, and possibly not even within a single scan line [
15]. The quality of LiDAR elevation data depends on several factors that can be grouped as follows: instrumental, methodological, and site-specific [
3,
4,
5,
15,
16,
17,
18]. Developing suitable corrections for such bias has been challenging because several factors contribute to the underestimation or overestimation of elevation with respect to the benchmark elevation.
The elimination or reduction of instrumental errors in the system parameters has been the focus of LiDAR research [
15,
19,
20,
21]. Instrumental errors are related to the vertical accuracy, density and spatial distribution of elevation points [
8]. Errors in the vertical accuracy of survey instruments (e.g., global navigation satellite systems (GNSS), total stations (TS), the inertial navigation system (INS) and its derived inertial measurement units (IMU)) with respect to a specified vertical datum are among the most important factors affecting the accuracy of LiDAR-derived DEMs [
4,
22,
23]. The misalignment between the IMU system and the scanner (the boresight misalignment) is the largest source of systematic error in a LiDAR system and must be addressed before the sensor can be effectively used because errors propagate into the subsequently derived products [
20]. These misalignments may cause systematic errors in DEMs in the centimetric range over flat terrain, rising to the decimetric range over steep slopes [
9]. Overall, vertical errors associated with instrumental issues occur near strip boundaries [
15,
24] and increase with flying height and off-nadir scan angle [
9,
19]. In addition, LiDAR point densities and spatial distribution of ground LiDAR points have significant effects on the accuracy of derived DEMs [
3,
10,
14,
25,
26,
27]. The quality of the terrain model is determined by the number and distribution of pulses that successfully reach the ground. Dense vegetation obstructs the LiDAR pulses, so fewer ground returns are available for DEM construction [
3]. It has been shown that very low pulse density during initial acquisition of LiDAR data is likely to compromise the quality of DEMs and derived canopy metrics [
14,
28]. However, with high-density LiDAR data, point density can be reduced without compromising elevation accuracy [
10,
27,
29].
Moreover, DEM accuracy can be affected by methodological factors, including the filtering algorithm used to classify LiDAR points as ground or non-ground and the interpolation method [
4,
18]. Point cloud classification filters and their settings may induce DEM errors by misclassifying understory or groundcover vegetation as ground returns [
3,
4,
30,
31,
32,
33,
34]. Classification filters can be grouped into three families: progressive densification, morphological and segmentation-based. The most popular progressive densification algorithm is the progressive TIN densification developed by Axelsson [
35] and implemented in the LAStools software [
36]. The progressive TIN densification filter is specifically designed for airborne LiDAR and works very well in natural environments, such as mountains and forests [
37]. Regarding morphological filters, the most prominent is the progressive morphological filter (PMF) proposed by Zhang [
38]. Although the PMF has proven successful, its performance changes according to the topographic features of the area, and the results are usually unreliable in complex and very steep areas [
39]. Finally, among the segmentation-based filters, the “Cloth Simulation Filter” (CSF) developed by Zhang [
39] is the most widely used; compared with a reference DEM, this filter preserved the main terrain shape and the microtopography.
In addition, to transform raw LiDAR points into elevation surfaces (DEMs), those points must be interpolated onto a regular grid. Even though LiDAR points are sampled at very small separation distances, the interpolation from points onto a grid can introduce a degree of uncertainty into DEMs [
10,
22,
27,
40]. The interpolation methods for constructing a DEM can be classified into the following [
41]: (i) deterministic methods, such as inverse distance weighted (IDW) interpolation (which assumes that each input point has a local influence that decreases with distance) and Delaunay triangulation-based interpolation, which interpolates within each triangle of a triangulated irregular network (TIN); and (ii) geostatistical methods, such as Kriging, which take into account both distance and the degree of autocorrelation (the statistical relationship between sample points) [
42]. IDW works well for dense and evenly distributed sample points [
5]. However, if the sample points are sparse or uneven, the results may not sufficiently represent the real surface [
41], and Kriging has produced better elevation estimates [
13,
42,
43]. Nevertheless, there is still a lack of consensus about which interpolation method is most appropriate for the terrain data, and none of the interpolation methods is universal for all kinds of data sources, terrain patterns or purposes [
5,
18,
44,
45]. Finally, the level of uncertainty in DEM accuracy can vary greatly with different grid sizes for resampling. Spatial resolution has significant effects on the accuracy of LiDAR derived DEMs [
10,
27]. Behan [
46] found that the most accurate surfaces were created using grids with a spacing similar to that of the original points. High-resolution DEMs generated from low-density LiDAR data have been shown to be more likely to represent the shape of the interpolator used than the actual terrain (i.e., interpolation artefacts become significant) [
47,
48].
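As an illustration of the deterministic family described above, the following minimal Python sketch interpolates ground points onto a regular grid with IDW. It is a sketch under stated assumptions rather than the implementation used in any cited study: the function names `idw` and `grid_idw`, the power of 2, the neighbour limit of 8 and the toy ground points are all hypothetical.

```python
import math


def idw(x, y, samples, power=2.0, max_neighbors=8):
    """Inverse-distance-weighted elevation at (x, y).

    samples is a list of (xs, ys, zs) ground points. The power and the
    neighbour limit are illustrative defaults, not values from the study.
    """
    nearest = sorted(
        samples, key=lambda s: math.hypot(s[0] - x, s[1] - y)
    )[:max_neighbors]
    weighted_sum = 0.0
    weight_total = 0.0
    for xs, ys, zs in nearest:
        dist = math.hypot(xs - x, ys - y)
        if dist == 0.0:
            return zs  # the target coincides with a sample point
        weight = 1.0 / dist ** power
        weighted_sum += weight * zs
        weight_total += weight
    return weighted_sum / weight_total


def grid_idw(samples, x_min, y_max, cell_size, n_rows, n_cols):
    """Interpolate ground points onto a regular grid of cell centres."""
    return [
        [idw(x_min + (c + 0.5) * cell_size,
             y_max - (r + 0.5) * cell_size,
             samples)
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Four hypothetical ground returns at the corners of a 2 m x 2 m square.
ground = [(0.0, 0.0, 10.0), (2.0, 0.0, 12.0), (0.0, 2.0, 14.0), (2.0, 2.0, 16.0)]
dem = grid_idw(ground, x_min=0.0, y_max=2.0, cell_size=1.0, n_rows=2, n_cols=2)
```

Each cell value is pulled toward its nearest sample, which is why IDW behaves well with dense, evenly distributed points and degrades when the points are sparse or clustered; Kriging would additionally require a fitted variogram, which this sketch does not attempt.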
Furthermore, the vertical accuracy of the DEMs is affected by site conditions (mainly vegetation and terrain morphological characteristics: steepness and roughness) [
5,
16,
17,
22,
49,
50,
51]. These influential site characteristics may be the largest contributors to DEM error, exceeding, in some areas, that caused by instrumental or methodological factors [
17]. Moreover, the steepness and roughness of the terrain are also responsible for errors in the acquisition of the original LiDAR data [
22,
52]. It has been found that as terrain slope increases, the vertical error in a LiDAR-derived DEM increases [
17], raising uncertainty in dependent forest metrics [
53,
54]. Bollmann et al. [
52] found a very low vertical error (±0.05 m) at slope angles <40° and a high error (±1.0 m) at slope angles >40°. On steep slopes, the spatial arrangement of ground and vegetation returns has similar characteristics: large elevation differences within small horizontal distances [
53,
55]. Consequently, the magnitude of DEM errors, and therefore the accuracy of LiDAR-derived terrain models, can be highly variable in complex terrain. Finally, dense canopy vegetation causes large vertical errors in LiDAR-derived DEMs due to the reduced ground point density [
2,
10,
13,
56,
57], so that the error increases rapidly with tree height [
13]. For example, Tinkham et al. [
33] showed that DEM accuracy varied from an RMSE of 0.13 m in forb meadows to 0.30 m under coniferous forest canopies and woodland ecosystems. Vegetation has been reported to cause both overprediction [
16] and underprediction [
32,
33] of terrain elevation.
The aim of this study was to apply an improved empirical method (i.e., DEMs of difference) to make multitemporal LiDAR datasets collected by different sensors with different point densities comparable. The improvement consisted of using continuous surfaces of elevation differences, which worked as a local and dense “pseudo-geoid”, instead of a collection of checkpoints, to adjust the elevations of low-density LiDAR to a high-density LiDAR benchmark. Moreover, we propose using “the best DEM of difference” (the one with the lowest vertical error), selected by comparing the vertical errors arising from different methodologies (i.e., classification filters, interpolation methods and spatial resolutions), to correct the elevation of low-density LiDAR more accurately in each environment. This approach captures the micro-topography of each site and reduces random or methodological errors that would be very difficult to correct using checkpoints alone. To accomplish this main objective:
- (i) we explored the effects of methodological factors (i.e., classification filters, interpolation methods and spatial resolution) on the vertical errors of DEMs (from the difference between low- and high-density LiDAR-derived DEMs) by carrying out a factorial ANOVA;
- (ii) we assessed how site properties (elevation, slope and vegetation height, as well as the distance to the nearest national reference geoid point) explained the observed vertical errors of DEMs by running GAMs on a random sample;
- (iii) we recalculated the DEMs and the canopy height models (CHMs) from the corrected low-density LiDAR data using the best method (filtering-interpolation-resolution); and finally,
- (iv) we assessed whether vegetation height changes remained related to DEM elevation errors.
3. Results
The Q-Q plots of the elevation differences between low- and high-density LiDAR-derived DEMs showed that vertical errors in each site followed a non-normal distribution (
Figure S2). Accordingly, the RMSE overestimated the vertical error, whereas the 50th percentile (P50) underestimated it; the NMDA was more appropriate because it is resilient to outliers in DEMs [
65]. Factorial ANOVA indicated that there were significant differences in elevation errors among sites (93% of the variance in residual heights was explained by sites (not shown)) and, according to the NMDA and RMSE, among classification filters too (the CSF filter was significantly different from the rest, showing the highest error) (
Table 5). In contrast, there were no significant differences in vertical errors when aggregating by interpolation method or spatial resolution. When vertical errors were analyzed within each site, significant differences were found among classification filters but not among interpolation methods or spatial resolutions (
Tables S2–S4). The greatest errors were in site 1 using the CSF filter (P50: −2.38 m; NMDA: 1.07 m and RMSE: 3.19 m) and the lowest ones in site 2 using the TIN-SW2 filter (P50: 0.01 m; NMDA: 0.28 m and RMSE: 0.41 m) (
Tables S2–S4).
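The contrast between the three error metrics can be reproduced with a short Python sketch. It assumes that the robust estimator abbreviated as NMDA in the text is the normalized median absolute deviation with the usual 1.4826 scaling constant; that definition, the function name `vertical_error_metrics` and the toy elevation differences are assumptions made for illustration only.

```python
import math
import statistics


def vertical_error_metrics(differences):
    """Return the median (P50), the normalized median absolute deviation
    and the RMSE of a list of elevation differences (same units as input)."""
    p50 = statistics.median(differences)
    abs_dev = [abs(d - p50) for d in differences]
    nmda = 1.4826 * statistics.median(abs_dev)  # assumed scaling constant
    rmse = math.sqrt(sum(d * d for d in differences) / len(differences))
    return {"p50": p50, "nmda": nmda, "rmse": rmse}


# Hypothetical elevation differences (m), without and with one gross outlier.
clean = [-0.2, -0.1, 0.0, 0.1, 0.2]
with_outlier = clean + [-5.0]

clean_metrics = vertical_error_metrics(clean)
outlier_metrics = vertical_error_metrics(with_outlier)
```

With these toy values a single outlier pushes the RMSE above 2 m, while the median-based estimator stays within a few decimetres, which illustrates why a robust metric is preferable when the differences are not normally distributed.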
Visual inspection of the DEM hillshades showed that the TIN densification algorithms using default parameters (DEF) or low spikes (0.5) (SF, SW2) produced smoother surfaces (
Figure 3 and
Figure S3) with lower vertical errors (
Table 5), whereas the CSF and the TIN densification algorithm using a large spike (3) (i.e., WILD), followed by the morphological filter (PMF), showed many bumps and spikes (
Figure 3 and
Figure S3) as well as large elevation differences (
Table 5).
After subtracting the elevation differences from the raw low-density LiDAR data using the best method (the one with the lowest vertical errors), a good adjustment was obtained (
Figure 4). Histograms of elevation differences before and after correction for each site using the best method are shown in
Figure S4. In addition, visual inspection of the corrected hillshades revealed the qualitative improvement with respect to the hillshades derived from the original ground classification of low-density LiDAR data (
Figure 5). Furthermore, a cluster analysis of uncorrected and corrected elevation values using the k-means algorithm and Dunn's non-parametric test for Kruskal-type ranked data confirmed the significant effect of correcting the raw low-density LiDAR elevation data (
Figure S5).
The corrected P50 ranged from −0.004 to −0.016 m, the NMDA from 0.06 to 0.10 m and the RMSE from 0.28 to 0.46 m, aggregating by sites (
Table 6). The corrected P50 did not show significant differences among sites, classification filters, interpolation methods or spatial resolutions. However, the corrected NMDA and RMSE indicated that the CSF filter, the 2 m spatial resolution and the IDW interpolation method (the latter only according to the NMDA) showed the greatest errors (
Table 6). Moreover, factorial ANOVA within each site showed significant differences among interpolation methods and spatial resolutions. The greatest errors remained in site 1 using the CSF filter (P50: −0.091 m; NMDA: 0.14 m and RMSE: 0.69 m) and the lowest ones in site 3 using the TIN-DEF filter (P50: −0.001 m; NMDA: 0.04 m and RMSE: 0.09 m). Overall, the TIN densification filters with low spikes (SF, SW2) and default settings (DEF) showed the lowest errors, as did the Kriging interpolation method and the 5 m spatial resolution of DEMs (
Tables S5–S7).
To explain vertical errors from site factors, GAMs (with linear terms) were fitted at different levels of aggregation. The global multivariate GAM including all sites and classification filters explained ca. 47% of total deviance (
Table 7). The slope and the distance to the nearest geoid point were the most important explanatory variables. Overall, as the slope increased, the elevation residuals became more negative (underestimation of benchmark elevations), whereas as the distance to the nearest geoid point increased, the elevation residuals became more positive (overestimation of benchmark elevations) (
Figure 6). Univariate GAMs for each site including all filters indicated that the distance to the nearest geoid point was the best explanatory variable for all sites, except for sites 2 and 6, which were poorly explained by the site conditions (
Table 8). Finally, multivariate GAMs for each classification filter within each site indicated that vertical errors in sites 1 and 4 were better explained by site factors than those in the other sites (deviance explained 40% ± 0.07 and 57% ± 0.07, respectively), as were the errors obtained with the TIN densification filter with default settings (DEF) and the PMF filter. Nevertheless, all these GAMs showed low fitting ability (from 13% to 57% of deviance explained) (
Tables S8–S13).
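To make the "deviance explained" figures above concrete, the following Python sketch computes that quantity for the simplest possible case: a Gaussian model with a single linear term, for which the deviance explained equals 1 − RSS/TSS of an ordinary least-squares line. It is a univariate stand-in, not the multivariate GAMs fitted in the study; the function name `deviance_explained` and the slope and residual values are hypothetical.

```python
def deviance_explained(x, y):
    """Fit y = a + b * x by ordinary least squares and return (b, D), where
    D = 1 - RSS/TSS is the fraction of Gaussian deviance explained."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares (model vs. intercept-only deviance).
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return slope, 1.0 - rss / tss


# Hypothetical sample: terrain slope (degrees) and elevation residuals (m).
slope_deg = [2.0, 8.0, 15.0, 22.0, 30.0, 38.0]
residual_m = [0.05, -0.02, -0.10, -0.08, -0.25, -0.30]
coef, explained = deviance_explained(slope_deg, residual_m)
```

A negative coefficient corresponds to residuals becoming more negative as slope increases, the direction reported for the global model; with several predictors the same ratio is computed from the residual and null deviances of the fitted model.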
Finally, we assessed whether changes in vegetation height could be related to the vertical errors found before correction. The results of this analysis indicated that changes in vegetation height were decoupled from elevation vertical errors in nearly all sites (
Tables S14 and S15). Nevertheless, we found a significant relationship in site 4 and in site 6 (
Figures S6 and S7). In those cases, there was a positive relationship, indicating that sampling points with large negative vertical errors had smaller vegetation height changes, whereas points with smaller elevation errors (less negative or positive) had larger vegetation height changes. As a preliminary result, we observed that from 2014 to 2019 the greatest vegetation height increases occurred in site 1 (+3.9 m on average), followed by site 3 (+3.0 m) and sites 2 and 4 (ca. +2.0 m). The lowest increases were found in site 6 (+1.33 m) and site 5 (+0.4 m) on average (
Figure 7).
Quantitative assessment of vegetation changes by repeated LiDAR data is a challenge because it requires producing reliable change maps that are robust to differences in survey conditions, free of processing artifacts, and that consider various sources of uncertainty, such as different point densities, georeferencing errors and geometric discrepancies [
58,
59]. Here, the raw elevation of low-density LiDAR was adjusted to the elevation of high-density LiDAR using the best DEM of difference as a local pseudo-geoid (i.e., all interpolated points). This approach improved the vertical accuracy of the target DEMs, minimized the random nature of vertical errors and decoupled vegetation changes from ground elevation errors.
4. Discussion
The first step before comparing low- and high-density LiDAR datasets was to ensure that there were no systematic or instrumental errors in data collection [
15]. The calibration method used to eliminate boresight misalignment in the benchmark high-density LiDAR ensured high internal vertical and horizontal consistency (1–6 cm of error in XYZ), and the vertical accuracy given by the vendor was around 3–6 cm. In addition, low-density LiDAR strips were aligned using GNSS ground checkpoints, and the vertical accuracy was 15–20 cm. The main differences in vertical accuracy between the two LiDAR datasets can be attributed to the larger beam size and lower sensitivity of the laser as flight altitude increases. For example, Hyyppä et al. [
17], who used data collected in three separate sites, observed that increasing the flight altitude from 400 to 1500 m increased the random error of DEM derivation by 50%. This was mainly due to the decrease in pulse density and the increase in planimetric error (for non-flat surfaces).
Nevertheless, a DEM derived from low-density LiDAR data (ca. 1 point/m² in our case) is hampered by the poor quality of ground returns [
10,
14,
32,
67]. Accordingly, the adjustment of low-density LiDAR elevation with respect to a benchmark with higher point density is recommended [
7]. James et al. [
11] demonstrated that extracting control points from a high-density LiDAR-derived DEM can produce a photogrammetric DEM of comparable quality to that achieved with highly accurate ground checkpoints. However, to achieve such positive results, it is important to choose suitable control points and to ensure that change between epochs is minimal [
11]. Moreover, an inadequate quantity and spatial distribution of ground control points (GCPs) will in turn degrade the accuracy of DEM products [
9]. In this sense, other studies reported certain limitations in correcting less accurate DEMs (i.e., SRTM DEM) by applying regression models using checkpoints from LiDAR data [
12,
13]. For example, Su and Guo [
12] observed that, although vertical errors decreased (from over 12 to −0.8 m) after correction, large errors persisted when using the regression model. Likewise, Su et al. [
13], following the same methodology, found that the corrected SRTM DEM remained about 2.5 m lower than the GLAS elevation on average. These results might indicate that the random nature of elevation errors can limit the usefulness of elevation corrections from regression analysis based on some checkpoints and their properties.
Here, we adjusted elevation data from low-density LiDAR (1 point/m² on average, acquired at a flying height of 3000 m) to high-density LiDAR elevation data (300 points/m² on average, acquired at a flying height of 50 m) using the entire grid of interpolated elevation differences as true checkpoints with respect to the high-density LiDAR, instead of only some control points. By significantly increasing the number of checkpoints used in the adjustment, the redundancy produced better results than using a smaller number of ground control points [
11]. Moreover, by means of the DEMs of difference, we built highly dense surfaces of elevation deviations (i.e., dense local “pseudo-geoids”) that served to adjust the raw elevation of low-density LiDAR points to the benchmark ones, giving a straightforward quantitative control test whereby differences closest to zero represented the best quality DEM surfaces [
4,
8,
9].
However, before making such elevation corrections, the effect of the classification filters, the interpolation methods and the spatial resolution on the vertical errors of DEMs must first be checked. It has been shown that the quality of a DEM depends on the correct classification of LiDAR points as ground [
3,
23]. The performance of the classification filters differs under different LiDAR point density as well as terrain and vegetation conditions [
2,
34,
35,
68]. Moreover, there is little guidance in the literature regarding the selection of parameters to optimize filtering [
37]. Accordingly, due to the lack of optimal classification filters, quality control becomes necessary to select the most suitable one in each context [
31].
We observed large differences in vertical errors among classification filters. As the Q-Q plots showed that vertical errors in each site followed a non-normal distribution, the NMDA, resilient to outliers, was used to report more appropriate vertical errors in DEMs [
65]. Before correction, and according to all error metrics, the CSF and the TIN-WILD filters showed significantly higher vertical errors than TIN-DEF, TIN-SW2 and TIN-SF in all sites. After correction, the vertical errors dropped significantly. The NMDA ranged from 0.14 m (site 1 using the CSF filter) to <0.06 m (in all sites and all classification filters except the CSF and TIN-WILD). These errors were within the accuracy specifications of low-density LiDAR data given by the vendor (0.15–0.20 m). Moreover, a detailed visual analysis of DEMs of difference after correction revealed that vertical errors were more randomly distributed, although they still reflected some artefacts derived from the classification filter used [
47,
48].
The TIN densification filter with “Default” settings (DEF) or low spikes (i.e., SW2 and SF) produced smoother DEMs for both LiDAR datasets, whereas the segmentation-based (CSF) filter and the TIN densification filter using large spikes (i.e., WILD) showed many artifacts such as bumps and peaks. CSF and TIN-WILD erroneously included low and dense vegetation as ground (type II error) but drew cliffs and steep slopes perfectly (see Viedma et al. [
68] for further details). In contrast, TIN densification filters using low spikes (TIN-DEF, TIN-SW2 and TIN-SF) correctly classified understory or low/dense shrubs as non-ground but resulted in several ground classification errors on steep slopes, cliffs and deeply incised stream banks (type I error) [
68,
69]. Thus, as none of them worked perfectly and all methods are susceptible to both omission and commission errors, users might consider testing different classification filters to avoid unpredictable results [
30] and combining multiple classification procedures to exploit the strengths of each [
68,
69]. For example, Viedma et al. [
68] combined different classification filters to estimate low (<2 m) and high (>2 m) vegetation in a very complex terrain in eastern Spain, removing type I and II errors.
Besides the effects of the classification filters on vertical accuracy of DEMs, the interpolation from points onto a grid can also introduce a degree of uncertainty into the DEMs [
6,
10,
28]. Here, we found no significant differences in vertical errors due to the interpolation method before correction, either among or within sites (even when controlling for the classification filter in the ANOVA (not shown here)). Nevertheless, after correction, and according to the NMDA, the vertical errors were significantly lower using Kriging than using IDW in all sites. It has been observed that the vertical errors in DEMs depend more on the choice of interpolation method at low point density than at high sampling density [
67]. At lower sampling densities, Kriging yielded the best elevation estimates, outperforming IDW [
5,
8,
13,
27,
67,
70]. However, if sampling data density is high, the IDW method performs well [
10,
42,
67]. Moreover, Yilmaz and Uysal [
18] found that TIN interpolation was superior to other interpolation methods, with the lowest RMSE. These results show that each interpolation method is suitable for specific conditions and that its general applicability is limited; there is still a lack of consensus about which interpolation method is most appropriate for different terrain data. None of the interpolation methods is universal for all kinds of data sources, terrain patterns or purposes [
18,
42].
Furthermore, we found no significant differences in vertical errors due to the spatial resolution of DEMs. Before correction, none of the error metrics showed significant differences in relation to the spatial resolution, either among or within sites (even when separating/controlling for the classification filter in the ANOVA (not shown here)). However, after correction, the NMDA, followed by the P50 and RMSE, indicated that the larger the pixel size, the lower the vertical error. We expected the most accurate surfaces to be those created using grids with a spacing similar to that of the original points (ca. 2 m) [
10,
46]. However, Guo et al. [
27] showed that DEMs at high resolutions (0.5 and 1 m) produced relatively higher RMSE than DEMs at lower resolutions (5 and 10 m), mainly because of the smoothing effect of the large pixel size.
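The smoothing effect of larger pixels can be illustrated with a minimal Python sketch that aggregates a grid of elevation differences by block averaging and compares the RMSE at both resolutions. The block-mean resampling, the function names `aggregate_mean` and `grid_rmse` and the toy values are assumptions of this example; the sketch only shows that averaging cancels fine-scale noise and says nothing about systematic offsets, which averaging would not remove.

```python
import math


def aggregate_mean(grid, factor):
    """Resample a grid to a coarser cell size by averaging factor x factor blocks."""
    n_rows, n_cols = len(grid) // factor, len(grid[0]) // factor
    coarse = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            block = [grid[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


def grid_rmse(grid):
    """Root mean square of all cell values (differences from the benchmark)."""
    values = [v for row in grid for v in row]
    return math.sqrt(sum(v * v for v in values) / len(values))


# Hypothetical 4 x 4 grid of elevation differences (m) with alternating noise.
fine = [[0.3, -0.3, 0.2, -0.2],
        [-0.3, 0.3, -0.2, 0.2],
        [0.1, -0.1, 0.4, -0.4],
        [-0.1, 0.1, -0.4, 0.4]]
coarse = aggregate_mean(fine, 2)

rmse_fine = grid_rmse(fine)
rmse_coarse = grid_rmse(coarse)
```

In this toy case the positive and negative differences cancel within each 2 x 2 block, so the coarse grid shows a much lower RMSE than the fine one even though the underlying data are the same.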
On the other hand, the vertical accuracy of the DEMs is affected by site conditions (mainly vegetation and terrain morphological characteristics) [
3,
4,
5,
17,
22,
33,
49,
50,
51]. In relation to the effects of vegetation, we showed that vertical errors in DEMs were only weakly explained by vegetation height (CHM), which reached a maximum of ca. 10% of deviance explained in some sites. However, despite the low explanatory power of this factor, vertical errors increased significantly with vegetation height, favoring the overestimation of benchmark elevations. These results agree with others that found positive elevation bias in highly vegetated areas and negative bias in open vegetation and agricultural land covers [
7,
9,
11,
13,
53,
66,
71]. Nevertheless, other studies found the opposite [
12,
23,
68]. For example, Chaplot et al. [
67] observed that elevations were overpredicted (6.0 cm) in open vegetated areas and underpredicted (−3.8 to −6.0 cm) in brush/low trees and evergreen forests. Likewise, Su and Guo [
12] found that the percentage of pixels with negative difference increased to nearly 100% in the group with the maximum vegetation height at all study sites. Finally, others have not found any relationship [
2,
8,
33]. In the case of Hodgson et al. [
32], it was found that LiDAR derived elevation was significantly underpredicted in all studied land cover classes; although the underprediction was largest in pine forest areas, by up to 0.24 m.
Regarding the effects of terrain complexity on vertical accuracy of DEMs, it has been shown that slopes, roughness or topographic variability (closely related to slope) contribute significantly to the errors of DEMs [
2,
14,
17,
23,
28]. For example, Razak et al. [
51] observed that DEM errors in slopes over 40° were significantly higher than the overall RMSE (1.53 m vs. 0.89 m). Tinkham et al. [
2] reported that the vertical error increased on slopes exceeding 30°, and Hodgson and Bresnahan [
23] found that the elevation error on slopes of about 25° was twice that on flatter slopes (1.5°). Finally, Mukherjee et al. [
66] showed that significant impacts on the accuracy of ASTER and SRTM DEMs occurred only on slopes above 10°. In contrast, Su et al. [
13] showed that, compared to the influence of vegetation, slope had no significant influence on the mean difference between the SRTM DEM and the LiDAR DEM. Likewise, Hodgson et al. [
32] reported that, except for low grass, none of the other land cover categories exhibited a statistically significant relationship between LiDAR elevation errors and terrain slope. Here, terrain slope was the best predictor of vertical errors among sites, explaining ca. 28% of total deviance, but this effect was weaker within sites. These results could be explained by the wide range of slopes (0 to 40°) covered among sites and the narrow range within them. Consistently, steeper slopes increased vertical errors and underpredicted benchmark LiDAR elevations (negative elevation differences).
In addition, we included the distance to the nearest geoid point as a site-specific variable. This variable had not been analyzed in previous studies and showed high explanatory power (24.3% of total deviance), second only to slope. The results indicated that the greater the distance to the nearest geoid point, the greater the positive vertical error (overprediction of benchmark elevation). For example, Daniels [
72] observed that the elevation difference between the ellipsoidal heights and the orthometric ones increased as one travels north from the Columbia River toward Washington. This regional trend implied that the use of an orthometric height correction factor calculated for the base station alone will not correct all offsets in the data, and that the correction factor required may vary based on the location of the LiDAR instrument in relation to the base station.
Finally, we found that changes in vegetation height were decoupled from vertical elevation errors. The significant relationship observed in some sites did not imply causality, mainly because we expected similar vertical errors in DEMs and CHMs after correction and because, in all cases, height differences were significantly larger than elevation errors, in agreement with the results obtained by Dubayah et al. [
73]. As a preliminary result, we quantified an average annual growth of 0.39 ± 0.24 m/y, ranging from 0.13 m/y (in site 5) to 0.87 m/y (in site 1), over 5 years (2014–2019). These different growth rates could be related to the age of the vegetation, assuming that younger trees grow faster than older ones (site 1 was composed of trees burned in 1990 and site 5 of older trees not burned in the last 40 years). For example, Vepakomma et al. [
74], using multitemporal LiDAR, showed that younger trees (ca. 10–15 m) grew ca. 2 m in 5 years (0.5 m/y), whereas older ones (>15 m) grew around 0.4 m (0.08 m/y) in Canada. Conversely, others have found larger growth rates in old forests using repeated LiDAR. For example, Englhart et al. [
75] quantified a tree height increase of 2.3 m in 4 years (ca. 0.6 m/y) in unaffected forests and Hopkinson et al. [
76] an annual growth of 0.5 m/y in a red pine plantation in Canada. Moreover, Hudak et al. [
77] reported even larger annual growth (0.8 m/y) in a mixed-conifer forest in the US. These large height changes could be attributed to the way conifer trees grow. Conifers grow by elongating their tips vertically and by elongating existing branches horizontally [
74]. Accordingly, points that hit low surfaces near the crown edges during the first LiDAR survey will be much higher in the last one because, owing to lateral growth, they now hit the crown (
Figure 8). These results indicated that changes in canopy architecture over time are also important.
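The annual growth rates compared above are simply the height change divided by the number of years between surveys. A minimal sketch of this arithmetic, using only figures quoted in the text (the function name is ours):

```python
def annual_growth(height_change_m, years):
    """Mean annual height growth (m/y) between two surveys."""
    if years <= 0:
        raise ValueError("years must be positive")
    return height_change_m / years

# Figures quoted in the text.
englhart = annual_growth(2.3, 4)      # unaffected forests, 4 years
older_trees = annual_growth(0.4, 5)   # trees > 15 m, 5 years

print(round(englhart, 1), round(older_trees, 2))
```

Note that a rate computed this way mixes vertical tip growth with the apparent height gain produced by lateral crown growth, so it should be read as an upper bound on true vertical growth near crown edges.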
On the other hand, low-density LiDAR has been reported to show a noticeable underestimation bias in tree height [
78,
79]. For example, Zhao et al. [
79] quantified a bias of −1.5 m (n = 598) with respect to field data, mainly due to the increased probability of missing treetops as pulse density decreases. Here, we show, as an example, the point clouds of some “supra crowns” to visually assess the canopy differences between low- and high-density LiDAR datasets (
Figure 9).
As future work, several issues need to be resolved to improve the estimation of vegetation growth from repeated LiDAR with different point densities: (i) a more robust comparison of vegetation height changes from low- and high-density LiDAR datasets, (ii) unmixing the effects of vertical and lateral growth, and (iii) developing robust models relating field growth measurements (for example, from multitemporal national forest inventories) to observed LiDAR values. For a reliable comparison between low- and high-density LiDAR datasets, we could apply the approach proposed by Zhao et al. [
79], who corrected the relative biases of tree heights measured with low-density LiDAR by taking the LiDAR data with the highest laser pulse rate and thinning it to a series of lower point densities. This approach built a regression model whose coefficients were used to correct tree heights derived from low-density LiDAR. In addition, to guarantee the comparability of forest metrics from low- and high-density LiDAR surveys, area-based LiDAR metrics will be applied because they are more reliable for estimating forest metrics [
79,
80,
81]. Compared to individual tree analyses, area-based analysis is less affected by the inconsistencies between low- and high-density LiDAR data when measuring forest dynamics, and it allows the spatiotemporal patterns of forest change to be classified [
79]. To unmix the effects of vertical and lateral growth, height measurements will consider only the common crown section, giving more accurate height changes. Finally, to ensure the robustness of the relationship with ground data, separate LiDAR models for each LiDAR survey, fitted with temporally coincident ground data, can counteract the inconsistency in repeat LiDAR data with different point densities [
2,
75].
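The thinning-and-regression idea attributed to Zhao et al. above can be sketched as follows. This is a simplified illustration, not their implementation: the trees are synthetic (return heights drawn at random below a hypothetical treetop), tree height is taken as the highest return, thinning is a random subsample, and the model is an ordinary least-squares line fitted by hand.

```python
import random

def thin(returns, keep_fraction, rng):
    """Randomly keep a fraction of the returns (at least one)."""
    k = max(1, int(len(returns) * keep_fraction))
    return rng.sample(returns, k)

def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def mean(values):
    return sum(values) / len(values)

rng = random.Random(42)
# Synthetic trees (assumption): 200 returns each, uniformly below the treetop.
tops = [rng.uniform(5.0, 25.0) for _ in range(50)]
trees = [[rng.uniform(0.0, top) for _ in range(200)] for top in tops]

high = [max(t) for t in trees]                 # height from full density
low = [max(thin(t, 0.05, rng)) for t in trees] # height after thinning to 5%

slope, intercept = fit_line(low, high)
corrected = [slope * h + intercept for h in low]

raw_bias = mean([l - h for l, h in zip(low, high)])
corrected_bias = mean([c - h for c, h in zip(corrected, high)])
print(raw_bias, corrected_bias)
```

In this sketch the thinned heights are biased low because treetops are more likely to be missed, and the fitted coefficients remove the mean bias on the calibration data. In practice the coefficients would be fitted per thinning density and then applied to an independent low-density survey.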
5. Conclusions
Standardization of ground elevation is required to compare LiDAR-derived forest metrics from different dates. After standardization, it can be reasonably assumed that the relative accuracy between CHMs is similar to the DEM accuracy. In this paper, we showed that, although high-density LiDAR data do not provide a perfect elevation model, they were considerably more accurate than low-density LiDAR and therefore served as a good benchmark. Our approach was novel mainly because continuous surfaces of elevation differences, instead of a collection of checkpoints, were used to correct the raw elevation of low-density LiDAR. Moreover, using the “best DEM of difference” approach, based on comparing the vertical errors arising from different methodologies (i.e., classification filters, interpolation methods and spatial resolutions), the elevation of each site was corrected more accurately, reducing random or methodological errors that would be very difficult to correct using checkpoints alone.
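The principle of correcting with a continuous surface of elevation differences can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in this study: the grids are synthetic, the low-density DEM is given a made-up systematic offset plus random noise, and a simple moving-average filter stands in for the interpolation step (the study used Kriging).

```python
import math
import random

def rmse(a, b):
    """Root mean square error between two grids of equal shape."""
    diffs = [x - y for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def mean_filter(grid, radius=1):
    """Moving-average smoothing; a stand-in for a proper interpolator."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [grid[i][j]
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

rng = random.Random(0)
size = 20
# Synthetic benchmark DEM (assumption): a tilted plane, elevations in metres.
benchmark = [[100.0 + 0.5 * r + 0.2 * c for c in range(size)] for r in range(size)]
# Synthetic low-density DEM: systematic offset and trend plus random noise.
low = [[benchmark[r][c] + 0.3 + 0.01 * r + rng.gauss(0.0, 0.05)
        for c in range(size)] for r in range(size)]

# DEM of difference, smoothed into a continuous correction surface.
dod = [[low[r][c] - benchmark[r][c] for c in range(size)] for r in range(size)]
dod_surface = mean_filter(dod)
corrected = [[low[r][c] - dod_surface[r][c] for c in range(size)] for r in range(size)]

rmse_before = rmse(low, benchmark)
rmse_after = rmse(corrected, benchmark)
print(rmse_before, rmse_after)
```

In this toy setting the smoothed surface removes the systematic component of the error and leaves only part of the random noise, which is why the RMSE drops after correction.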
Overall, the classification filters based on the TIN densification algorithm using default parameters or low spikes (TIN-DEF or TIN-SF), the Kriging interpolation method and a 5 m pixel size adequately corrected elevation deviations at all sites. After correction, vertical errors declined drastically, reaching low values (from 0.04 to 0.08 m) using the best DEM of difference at each site. Nevertheless, none of the classification filters worked perfectly at every site, and all of them are susceptible to both omission and commission errors. Accordingly, combining different classification filters in the same environment is recommended to exploit the strengths of each. On the other hand, the vertical errors observed between low- and high-density LiDAR datasets were partially explained by site factors (maximum deviance explained was 57% ± 0.07). The slope and the distance to the nearest geoid point were the most important explanatory variables. Overall, the higher the terrain slope and the distance to the nearest geoid point, the higher the vertical errors. Vegetation height played a minor role, although vertical errors increased significantly with vegetation height. Finally, we found that changes in vegetation height were decoupled from vertical elevation errors at all sites.
The findings of the study are important for understanding the sources of DEM error by analyzing the role of methodological and site-specific factors, and also for correcting vertical errors of low-density LiDAR-derived DEMs by using “the best DEM of difference” method. This approach considerably reduced the effects of elevation errors on vegetation height changes. The study recommends that, before comparing LiDAR datasets with different point densities, “the best DEM of difference” should be considered for correcting the target elevation dataset, mainly because each environment has its own physical properties and this method allows the best DEM to be adapted to each site. Finally, the estimation of vegetation growth from repeated LiDAR with different point densities must be improved and requires further study.