Integrating LiDAR, Multispectral and SAR Data to Estimate and Map Canopy Height in Tropical Forests

Developing accurate methods to map vegetation structure in tropical forests is essential to protect their biodiversity and improve their carbon stock estimation. We integrated LIDAR (Light Detection and Ranging), multispectral and SAR (Synthetic Aperture Radar) data to improve the prediction and mapping of canopy height (CH) at high spatial resolution (30 m) in tropical forests in South America. We modeled and mapped CH estimated from aircraft LiDAR surveys as a ground reference, using annual metrics derived from multispectral and SAR satellite imagery in a dry forest, a moist forest, and a rainforest of tropical South America. We examined the effect of the three forest types, five regression algorithms, and three predictor groups on the modelling and mapping of CH. Our CH models reached errors ranging from 1.2–3.4 m in the dry forest and 5.1–7.4 m in the rainforest and explained variances from 94–60% in the dry forest and 58-12% in the rainforest. Our best models show higher accuracies than previous works in tropical forests. The average accuracy of the five regression algorithms decreased from dry forests (2.6 m +/− 0.7) to moist (5.7 m +/− 0.4) and rainforests (6.6 m +/− 0.7). Random Forest regressions produced the most accurate models in the three forest types (1.2 m +/− 0.05 in the dry, 4.9 m +/− 0.14 in the moist, and 5.5 m +/− 0.3 the rainforest). Model performance varied considerably across the three predictor groups. Our results are useful for CH spatial prediction when GEDI (Global Ecosystem Dynamics Investigation lidar) data become available.


Introduction
As part of the response to mitigate the current environmental and biodiversity crises, several measures to monitor global biodiversity change, termed essential biodiversity variables (EBVs), have been proposed [1,2]. Some EBVs should be informed by remote sensing data to facilitate long-term global monitoring [3]; thus, developing accurate links between EBVs and remote sensing data is critical to mapping and monitoring EBVs. Canopy height (CH) and vegetation phenology (leaf phenology) are highly relevant for developing EBVs for tropical forests, a group of carbon-rich terrestrial ecosystems with the highest biodiversity on Earth [4]. penetrate only the upper canopy layer. However, high correlations between the C-band backscatter and forest structural variables were found lately in temperate forests [48]. Also, recent studies combining L-and C-band backscatters showed slight accuracy improvement in estimating above ground biomass of tropical [38], temperate [49], and mangrove [50] forests.
Here our objective was to develop a LIDAR/multispectral/SAR method to improve the prediction and mapping of CH in tropical forests at high spatial resolution (30 m), with methods that could be applied for wide area mapping once GEDI CH data are available. Our primary datasets were airborne LiDAR surveys of CH and annual metrics of multispectral and SAR predictors. Our analyses were focused on the relation between CH and multispectral and SAR predictors calibrated with LiDAR structure variable estimates [21,46,47]. We hypothesized that annual metrics (annual means and annual standard deviations) of multispectral and SAR bands together should be more sensitive to variability in CH than using multispectral or SAR separately, allowing the generation of CH maps with better accuracy than achieved previously. CH is essential to build maps of other forest ecological variables (e.g., carbon stock, biomass, vertical structure); thus, accurate CH estimation may also improve other ecological assessments. To test this hypothesis, we estimated CH in tropical forests of three ecoregions in South America across a precipitation gradient (dry forest, moist forest, and rainforest) using discrete return airborne LiDAR. The LiDAR surveys included a range of CHs to establish how performance of multispectral and SAR predictors changes with CH and tropical forest type. Our analyses were done at a spatial resolution similar (30 m) to the GEDI (Global Ecosystem Dynamics Investigation) LiDAR footprint (25 m) in order to further assess the potential of future GEDI data for similar analyses across tropical and temperate forests.

Study Areas and Forest Types
Since a rainfall regime is one of the main determinants of leaf phenology (greenness maximization) and microclimate in the tropical forests [51], we selected three locations in three types of tropical forest areas of South America with different annual rainfall totals ( Figure 1): (1) Mato Grosso dry forest (MAT) is a transitional forest located in the Mato Grosso and Pará provinces of Central Brazil. MAT presents a high diversity of plants, animals and indigenous peoples. Average temperature is 23.5 • C and annual rainfall is around 2050 mm with eight moist months from December to May and four dry months from June to September [52]. The seasonal climatology in MAT is similar to the south American moist forests and savannas; however, this transitional forest typically receives about 200 mm less rainfall per year than the moist forests and 500 mm more rainfall than the savannas [53].
(2) Tapajós-Xingu moist forest (TAP) is located between the Tapajós and Xingu rivers that flow within the Amazon Basin of central-eastern Brazil. These forests host impressive levels of biodiversity and are characterized by a high density of lianas (woody vines), which creates a low and open understory. Average temperature is 25-28 • C, annual rainfall is around 2147 mm, and dry months are from August to October [54]. (3) Chocó-Darien rainforests (CHOCO) is located along the Colombian Pacific coast and the southeastern of Panama [55]. Past geologic events have turned this forest into one of the most diverse and endemic tropical forest areas on the Earth [56,57]. CHOCO is among the rainiest areas on the planet with an annual average rainfall between 8000 to 13000 mm and a lower rainfall season from January to March. The average temperature is 23 • C [58].

LiDAR Data and Canopy Height
Airborne discrete return LiDAR data were acquired during the dry months of 2018 for MAT and TAP by the Sustainable Landscapes Brazil project supported by the Brazilian Agricultural Research Corporation (EMBRAPA), the US Forest Service, USAID, and the US Department of State, using an OPTECH ALTM 3100 at an average flight altitude of 750 m with average return density of 22 returns/m 2 . For CHOCO, the LiDAR data were acquired using an OPTECH ALTM3033 during lower rainfall months of 2018 at an average flight altitude of 1000 m with average return density of 24 returns/m 2 . The LiDAR surveys were located in lowland, nonflooded areas of the three forest types at 44 m above sea level elevation average in MAT, 51 m in TAP, and 12 m in CHOCO. These data were processed by summing the returns within a laser footprint calculated at 1 m spatial resolution and 1-m vertical resolution [23,59]. Firstly, we selected the ground returns (Classification = 2

LiDAR Data and Canopy Height
Airborne discrete return LiDAR data were acquired during the dry months of 2018 for MAT and TAP by the Sustainable Landscapes Brazil project supported by the Brazilian Agricultural Research Corporation (EMBRAPA), the US Forest Service, USAID, and the US Department of State, using an OPTECH ALTM 3100 at an average flight altitude of 750 m with average return density of 22 returns/m 2 . For CHOCO, the LiDAR data were acquired using an OPTECH ALTM3033 during lower rainfall months of 2018 at an average flight altitude of 1000 m with average return density of 24 returns/m 2 . The LiDAR surveys were located in lowland, nonflooded areas of the three forest types at 44 m above sea level elevation average in MAT, 51 m in TAP, and 12 m in CHOCO. These data were processed by summing the returns within a laser footprint calculated at 1 m spatial resolution and 1-m vertical resolution [23,59]. Firstly, we selected the ground returns (Classification = 2 according to LAS file format) and generated 1-m DTMs (digital terrain models) using the k-nearest neighbor approach with Remote Sens. 2019, 11, 2697 5 of 20 the inverse-distance weighting (IDW). We then obtained normalized heights (heights above ground) for each point of the point cloud, subtracting the estimated ground elevation from the raw point elevation value. We then selected the highest reflections in the 1-m spatial resolution to compute 1-m CH models. Secondly, we calculated height percentile metrics at 98% of the cumulative points relative to the ground (Figure 2a). We used height metrics at 98% as estimation of the CH (Figure 2b); previous analyses have shown this height metric is the most precise canopy estimator since height metrics at 100% can be noisy due to its association with the first return [21]. The CHs of 1 m of spatial resolution ( Figure 2c) were subsequently resampled at 30 m of spatial resolution using bilinear interpolation to align with 30 m Landsat-8 grids. These procedures were implemented using the R packages "Raster" and "lidR" [60,61].
Remote Sens. 2019, 11, x FOR PEER REVIEW 5 of 21 according to LAS file format) and generated 1-m DTMs (digital terrain models) using the k-nearest neighbor approach with the inverse-distance weighting (IDW). We then obtained normalized heights (heights above ground) for each point of the point cloud, subtracting the estimated ground elevation from the raw point elevation value. We then selected the highest reflections in the 1-m spatial resolution to compute 1-m CH models. Secondly, we calculated height percentile metrics at 98% of the cumulative points relative to the ground (Figure 2a). We used height metrics at 98% as estimation of the CH ( Figure 2b); previous analyses have shown this height metric is the most precise canopy estimator since height metrics at 100% can be noisy due to its association with the first return [21]. The CHs of 1 m of spatial resolution ( Figure 2c) were subsequently resampled at 30 m of spatial resolution using bilinear interpolation to align with 30 m Landsat-8 grids. These procedures were implemented using the R packages "Raster" and "lidR" [60,61].

Multispectral and SAR Data
We used data from three satellite instruments that observe different regions of the electromagnetic spectrum: (1) Four bands of the multispectral sensor Landsat-8: shorth wave infrared1 -SWIR1 (1.566-1.651 µm), short wave infrared2 -SWIR2 (2.107-2.294 µm), brightness temperature1 -Thermal1 (10.60-11.19 µm), and brightness temperature2 -Thermal2 (11.50-12.51 µm). In addition, we included the Enhanced Vegetation Index (EVI) and the Normalized Difference Vegetation Index (NDVI) estimated from Landasat-8. These indices detect complementary vegetation characteristics. EVI reduces atmospheric influence and is sensitive in denser canopies where NDVI tends to saturate; NDVI is better than EVI at detecting vegetation cover with low biomass and CH [62][63][64]. We did not use blue (0.452-0.512 µm), red (0.636-0.673 µm) and near infrared-NIR (0.851-0.879 µm) of Landasat-8 in our analysis since these bands are included in EVI and NDVI estimations. (2) The HV and HH polarization backscattering coefficients of the radar L-band generated by the SAR sensor ALOS-2-PALSAR (Phased Array type L-band Synthetic Aperture Radar). (3) The VV and VH polarization backscattering coefficients of the radar C-band generated by the SAR sensor Sentinel-1 (H refers to horizontal and V refers to vertical).
These data were obtained in Google Earth Engine [65]. For the Landsat data, we used the USGS Landsat-8 Surface Reflectance Tier-1 product. In this product, the visible bands, NIR, and SWIRs are processed to orthorectified surface reflectance while the two thermal bands (Thermal1 and Thermal2) are processed to orthorectified brightness temperature. These data are also atmospherically corrected using LaSRC (Landsat 8 Surface Reflectance algorithm), which includes cloud, shadow, water and snow masks [66]. These filters produced between seven and nine corrected values for each band per year in the study areas; we used these values to estimate annual means (X-means) and annual standard deviations (X-Sds) of the Landsat-8 data. All the Landsat bands are acquired at 30-m resolution, except the thermal bands that are resampled from 100 m to 30 m using cubic convolution to match the other spectral bands [67]. For the ALOS-2-PALSAR data, we used the Global PALSAR-2/PALSAR Yearly Mosaic [40]. This product has only one date per year; thus, we used three years (the previous, the actual, and the posterior years) to estimate X-means and X-Sds. The original spatial resolution of this product is 25 m; thus, it was resampled to the 30-m Landsat spatial resolution using bilinear interpolations. For the Sentinel-1 data, we used the product Sentinel-1 SAR GRD. This collection provides weekly data processed using the Sentinel-1 Toolbox to generate a calibrated and ortho-corrected product [68]. We estimated X-means and X-Sds using these weekly data. The original spatial resolution of this product is 15 m; therefore, we also resampled these data to the 30 m-Landsat spatial resolution using bilinear interpolations. We evaluated possible relations between the CHs of the LiDAR surveys and the X-means and X-Sds of these multispectral and SAR data using linear and nonlinear correlation analyses. We also used these X-means and X-Sds as the predictors of our models (see Modeling and Mapping). Our method for integrating discrete return LiDAR, Lansat-8, Sentinel-1, and PALSAR-2/PALSAR data was not a strict remote sensing fusion; we used the CHs estimated from LiDAR surveys as ground reference see [21,46,47] to model CHs to areas with not LiDAR coverage using annual metrics of the multispectral and SAR as predictors.
We estimated annual curves using only the corrected values for each band per year (multispectral and SAR data) at spatial resolution of 30 m in the areas corresponding to the LiDAR surveys. We applied a Gaussian filter with a temporal window of two corrected values in each annual curve for smoothing and improving its differentiation. The annual curves of each band were compared between lower CHs and upper CHs in the three forest types. Lower CHs were areas with vegetation canopy < 4 m while upper CHs were areas with vegetation canopy > 25 m in MAT and >30 m in TAP and CHOCO; the sizes of the upper CHs were different among forest types due to the availability of data. To build these annual curves, we first selected randomly 74 pixels for each of both CH classes in MAT, 100 pixels in TAP, and 69 in CHOCO. The number of evaluated pixels varied among the three study areas due to the availability of the data. We then averaged these curves across the CH classes, bands, and forest types and evaluated possible significant differences between the two CH classes per band in each study area using paired t-tests.

Modeling and Mapping
We used five different regression algorithms to model the relationship between LiDAR estimated CH and the X-means and X-Sds of the multispectral and SAR predictors-linear regression (lm), random forest (RF), support vector machine (SVM), multivariate adaptive regression splines (MARS), and Lasso and Elastic-Net Regularized Generalized Linear Models (GLM.net). We initially had 20 predictors; 10 X-means and 10 X-Sds corresponded to EVI, NDVI, SWIR1, SWIR2, Thermal1, Thermal2, L-band polarizations HV and HH, and C-band polarizations VV and VH. We grouped the variables in three different sets to run the five regression models. A first group formed by the 20 predictors variables (Vars.1). A second group of predictor variables (Vars.2) that included only the X-means and X-Sds with significant linear (significant r > 0.2) or nonlinear (significant k > 0.2) correlations to the CH in any of the three study areas since small r and k do not affect the trends of the response variables and could overfit the models [8]. We then examined the multicollinearity among the predictors with significant relations to the CH (r > 0.2), performing variance inflation factor through a stepwise procedure to eliminate correlated predictors and thus reduce possible overfitting effects in the regression models [69] using the R package "usdm" [70]. These predictor variables formed the third group (Vars.3).
We ran the five regression algorithms using 30-m pixel training sites where CH was estimated by the LiDAR surveys: 7408 training sites in MAT, 3207 in TAP, and ,024 in CHOCO ( Table 1). The training sites were selected using a spatial filter of 60 m 2 to reduce spatial autocorrelation. We used RMSE (root mean squared error) and MAE (mean absolute error) to evaluate the accuracy of the models. RMSE and MAE express average model prediction error in units of the variable of interest; however, RMSE gives a relatively high weight to larger errors and MAE penalizes greater variances of the errors [71]. We reported RMSE in the results and MAE in the Figure S1. We also estimated the explained variation or explained variance (adjusted R 2 ) by the predictors in the models to evaluate the fit of the predictors (Vars.1, Vars.2, and Vars.3) in the models where higher adjusted R 2 corresponds to better fit. We performed 100 iterations for each of the five regression models in the three study areas using the R Package "caret" [72]; each iteration included a fivefold cross-validation to estimate RMSE, MAE and adjusted R 2 . We performed stepwise model selections by AIC using the R packages "MASS" [73] for the lm regressions. For the RF regressions, we used 500 trees and three as the number of variables randomly sampled as candidates at each split of each tree; we used the R package "randomForest" [74]. For the SVM regressions, we performed linear kernels and used a value of one as the cost of constraint violation in the models; we used the R package "e1071" [75]. We used the R packages "earth" [76] and "glmnet" [77] and followed their default settings to implement the MARS and GLM.net regressions. MARS is an extension to linear regression that captures nonlinearities and interactions between variables [78] while GLM.net fits generalized linear model via penalized maximum likelihood [79]. Using these regression algorithms, we finally generated maps of the CH based on the multispectral and SAR predictors. The maps were create using the R package "raster" [60] and the spatial prediction (extent of maps) was less than 100 times the coverage of the LiDAR surveys [22]. In this work, we only present the maps generated by the most accurate regression model.

Results
Differences between annual curves of multispectral bands and vegetation indices were discernable in MAT and TAP when lower and upper CHs were compared (p < 0.05; t > 2.6); however, only the NDVI annual curves showed significant differences between both CH classes in CHOCO (p < 0.05; t > 3.6) ( Table 2) annual curves of both polarizations of the L-band were discernable between lower and upper CHs in MAT (p < 0.05), but differences were discernable only in the polarization HV in TAP (p < 0.3) and only the polarization HH in CHOCO. Annual curves of both polarizations of the C-band were discernable between lower and upper CHs in MAT and TAP, but only the VH polarization in CHOCO (p < 0.05) ( Table 2). Table 2. Annual curve comparison of multispectral and SAR bands for lower canopy heights (CHs < 4) m and upper canopy heights (CHs > 25 m in MAT and CHs > 30 m in TAP and CHO) in three tropical forest types. Annual precipitation of the forest was included in the name of the forest type. Paired t-tests test differences in mean band values between CH classes in the forest types. Bold and underline highlight significant p and t values. Significant p value ranges; p < 0.001(***), p < 0.01(**), and p < 0.05(*).   All X-means of the bands and vegetation indices were linearly correlated to CH in MAT with r magnitudes > 0.2, eight of these variables in TAP, and six in CHOCO. The magnitudes of these linear correlations were higher in MAT, lower in TAP, and even lower in CHOCO for each vegetation index and band, except the C-band polarizations in CHOCO that were higher compared with TAP. Xmeans of the two SWIRs decreased with CH in the three forest types while EVI and NDVI increased. All X-means of the bands and vegetation indices were linearly correlated to CH in MAT with r magnitudes > 0.2, eight of these variables in TAP, and six in CHOCO. The magnitudes of these linear correlations were higher in MAT, lower in TAP, and even lower in CHOCO for each vegetation index and band, except the C-band polarizations in CHOCO that were higher compared with TAP. X-means of the two SWIRs decreased with CH in the three forest types while EVI and NDVI increased. X-means of both thermal bands decreased with the CH in MAT and TAP but did not show significant trends in CHOCO. X-means of L-bands and C-bands increased with CH in the three forest types (Table 3, Figure S2). Four of the X-SDs were linearly correlated to CH in MAT with r magnitudes > 0.2, two of them in TAP, but none in CHOCO. These four X-SDs (EVI, NDVI, and the C-bands) decreased with the CH and their r magnitudes were higher in MAT than in TAP (Table 4, Figure S3). In general, the nonlinear coefficients of X-means and X-SDs for each vegetation index and band showed lower correlation magnitudes than the linear coefficients (Tables 3 and 4). Table 3. Linear (Pearson's r) and nonlinear (Kendall's tau) correlations between canopy height and annual means of multispectral and SAR bands (X-means) in three tropical forest types. Annual rain of the forest was included in the name of the forest type. Bold and underline highlight significant r > 0.2 or k > 0.2. Significant P value ranges; p < 0.001(***), p < 0.01(**), and p < 0.05(*).  Table 4. Linear and nonlinear correlations between canopy height and annual standard deviations of multispectral and SAR bands (X-Sds) in three tropical forest types. Bold and underline highlight significant r > 0.2 or k > 0.2. Significant P value ranges; p < 0.001(***), p < 0.01(**), and p < 0.05(*).   Table 3), and the third group by no-collinear predictors with significant correlations (significant r >0.2 or k>0.2) to the CH (Vars.3).

Tapajós-Xingu Moist Forests
Another multifactorial ANOVA showed an analogous pattern for the explained variance; forest type (F = 21615347; P < 0.001; n =4500), regression model (F = 2999577; P < 0.001; n =4500), and method of predictor variable selection (F = 599039; P < 0.001; n = 4500) affected significantly the explained variance of the models for predicting CH (Figure 4b). By forest type, the five regression models in the Model accuracy was estimated by RMSEs (root mean squared error) in meters, while explained variance by R 2 . Five regression models were exanimated: random forest (RF), multivariate adaptive regression splines (MARS), liner regression (lm), Lasso and Elastic-Net Regularized Generalized Linear Models (GLM.net), and Support Vector Machine (SVM). Three groups of SAR and multispectral predictors also were evaluated: the first group formed by the 20 predictors variables (Vars.1), the second by 14 predictors that had significant correlations (significant r > 0.2 or k > 0.2) to the CH (Vars.2) (see Table 3), and the third group by no-collinear predictors with significant correlations (significant r > 0.2 or k > 0.2) to the CH (Vars.3). CHOCO) and MARS resulted in the second highest adjusted R 2 (0.75+/-0.2 in MAT, 0.64+/-0.2 in MAT, and 0.38 +/-0.7 in CHOCO). For variable selection, the adjusted R 2 tend to be slightly higher for Vars.1 than for Vars.2 and Vars.3 in each regression sets of MAT and TAP (differences lesser than 0.07 in general), but adjusted R 2 was substantial higher in the regression sets of CHOCO (differences higher than 0.13 in general). Finally, the accuracies of CH-models using the best regression algorithm (RF) were higher in MAT (1.22 to 1.33 m) than in TAP (4.7 to 5 m) and CHO (5.13 to 5.83 m) ( Figure  5). Three groups of SAR and multispectral predictors also were evaluated: the first group formed by the 20 predictors variables (Vars.1), the second by 14 predictors that had significant correlations (significant r > 0.2 or k > 0.2) to the CH (Vars.2) (see Table 3), and the third group by no-collinear predictors with significant correlations (significant r > 0.2 or k > 0.2) to the CH (Vars.3).
Another multifactorial ANOVA showed an analogous pattern for the explained variance; forest type (F = 21,615,347; p < 0.001; n = 4500), regression model (F = 2,999,577; p < 0.001; n = 4500), and method of predictor variable selection (F = 599,039; p < 0.001; n = 4500) affected significantly the explained variance of the models for predicting CH (Figure 4b). By forest type, the five regression models in the three methods of variable selection produced the highest average adjusted R 2 in MAT . For variable selection, the adjusted R 2 tend to be slightly higher for Vars.1 than for Vars.2 and Vars.3 in each regression sets of MAT and TAP (differences lesser than 0.07 in general), but adjusted R 2 was substantial higher in the regression sets of CHOCO (differences higher than 0.13 in general). Finally, the accuracies of CH-models using the best regression algorithm (RF) were higher in MAT (1.22 to 1.33 m) than in TAP (4.7 to 5 m) and CHO (5.13 to 5.83 m) ( Figure 5).

Improving CH Mapping
The first key contribution of our work was the use of annual metrics (X-means and X-SDs) of multispectral and SAR predictors for improving the mapping of CH LiDAR measures in tropical forests. Depending on forest type, our CH models reached errors close to 1 m and explained 94% of variance in CH, which are better than reported in previous work for tropical forests [36,42,47,80]. Because we used free and open resources (imagery and software), this improvement of the CH mapping can be adopted to facilitate the accurate long-term global monitoring of this EBV. Annual metrics of multispectral and SAR predictors also could be examined to model spatially other EBVs (such as the other ecosystem structure variables) using discrete or waveform LiDAR sensors as ground reference (including GEDI that is available from 2019) because both LiDAR technologies generate accurate measures of vegetation structure.
The improvement of the spatial estimation of CH is a result of the integration of two main factors: (1) The use of bands of different regions of the electromagnetic spectrum increased sensitivity to CH, likely due to forest characteristics such as leaf phenology, leaf density, moisture, and temperature; (2) The use of annual metrics reduced the errors produced when data of specific dates are used as we showed in our analysis of annual curves (Figure 3), where lower and upper CHs had similar band and vegetation indices values during some parts of the year while X-means were different. Other work has also demonstrated improved vegetation characterization when annual metrics of multispectral sensors are used, supporting our results [36,64,81,82].
We found the X-means of ten predictors (EVI, NDVI, two SWIRs, two thermal bands, two polarization of the L-band, and two polarization of the C-band) and four of their X-SDs (EVI, NDVI, and two polarization of the C-band) varied with CH, making them useful predictors of our CH map models and confirming ecological characteristics that have been researched previously. We specifically found that X-means of EVI and NDVI increased with the CH whereas their X-SDs decreased, suggesting variations of leaf phenology and leaf density related to CHs. Previous studies have found that a greenness maximization in well-preserved tropical forests (the areas of upper canopies in tropical forest) results in high and persistent yearly greenness [8,10,32]. Our results show that this greenness maximization is reduced in earlier succession (the areas of lower statured canopies in tropical forest), resulting in lower X-means and higher X-SDs of EVI and NDVI in these areas.
We also found negative correlations between the X-means of SWIR bands and CHs, indicating vegetation moisture changes with CH variations since the SWIRs of Landsat-8, specially SWIR2, are sensitive to water content of soil and vegetation [83]. The areas of upper CH in the well-preserved tropical forests are characterized by higher and more stable moisture during rainy and dry seasons. Conversely, areas of lower CH and biomasses or earlier succession (secondary vegetation and degraded forests) are more susceptible to incidence of light, air temperature change, and wind, which decreases the relative humidity in soil and litter, suggesting a mechanism for our results [13,14,84]. Other studies have identified similar gradients of moisture content using SWIRs in soils [85] or have inferred degrees of vegetation water stress and subsequent reduction of plant productivity [86,87], also supporting our results.
Our results show negative correlations between the X-means of thermal bands and the CH. Thermal band values are directly related to land surface temperature [88]; thus, our results show reductions of the land surface temperature with increasing CH. High evapotranspiration consumes more heat, decreasing the land surface temperature [89]. Evapotranspiration is higher in forest with high biomass compared with earlier successional forest [90]. These differences in evapotranspiration are consistent with our findings of cooler temperatures in taller forests. Using multispectral sensors, others have found land surface temperature is affect by land cover type and biophysical characteristics, such as elevation and aspect, at moderate to coarse spatial resolutions [91,92]. Our methods were able to detect differences in land surface temperature at moderately high spatial resolution in the same vegetation type, elevation, and aspect, confirming that annual metrics increase the ability of remote sensors to distinguish ecological characteristics in tropical forests.
X-means of the polarizations of the L-band were positively correlated with CH, which is consistent with previous studies showing the backscatter of SAR sensors can be used for estimation of CH [38,93]. Until P-band spaceborne SAR data, which is able to more deeply penetrate forest canopies, becomes available from the ESA BIOMASS Mission, L-band is the most suitable operational spaceborne SAR data for estimation of forest structure variables [94]. The X-means of the C-band also increased with CH. Although the C-band is not able to deeply penetrate forest canopies, the higher backscatter of both C-band polarizations in upper CHs is consistent with high densities of leaves in the upper canopies. Emergent tall trees and lianas compete for light in the well-preserved forest producing dense canopy strata [95] and generating the high values for the C-band backscatter captured in our analyses. Leaf density is lower in earlier succession, consistent with lower C-band backscatter.

Map Accuracy Among Forest Types
The second key contribution of our work was demonstrating that the accuracy of the CH maps in tropical forests decrease from dry forests to moist and rainforests. Variations of our analyses, which were generated by different remote sensing resources and methods, each supported this finding. All our model combinations using five regression algorithms and three selections of predictor variables produced the highest accuracies and explained variations in the dry forest (MAT), were lower in the moist forest (TAP), and were even lower in the rainforest (CHOCO). These patterns were generated due to the reduction of predictors correlated with CH and the decreasing magnitude of these correlations in moist and rainforest. A lower quality of the multispectral sensor data in moist and rainforests could explain the general decline of these correlations since these forests present high levels of aerosol gases and clouds that attenuate multispectral imagery [64]. Our multispectral predictors were likely to be slightly less affected by aerosol gases and clouds since we used annual series formed by two-month mosaics of Landsat-8 using CFMask (The C Function of Mask) filters (i.e., filters for selecting pixels with low aerosol gases and no-clouds). Additionally, correlations between SAR predictors and CH also decreased in the moist and rainforest, despite the fact that C-and L-band SAR data are not substantially affected by aerosol gases and clouds. Thus, changes of ecological characteristics from the dry forest to the moist and rainforest would be the main cause of the general declining of these correlations and their subsequent effects in the accuracy of our CH maps and models.
We found higher magnitudes of the correlations in the dry forest, indicating that early succession (lower CHs) and late succession (upper CHs) in dry forest areas present higher ecological contrasts than early and late succession in moist and rainforest areas. Others have confirmed that environmental conditions are more restrictive for vegetation regeneration in dry forests than in moist and rainforests, such as lower rainfalls, higher temperatures, intense light incidences, and stronger winds, supporting our results [96,97]. Stronger negative correlations between thermal bands and CHs in the dry forest confirm lesser growth, biomass, and evapotranspiration [98], resulting in higher contrast of land surface temperatures between upper and lower CHs compare with moist and rainforests. We also found lower and upper CHs had different values of SWIRs, EVI and NDVI in the dry forest through the year whereas they were similar most of the year in the rainforest areas. High rainfall in the rainforest (>8000 mm) throughout the year generates high moisture levels and leaf densities across stages of forest succession, including stages with low CHs. Thus, SWIRs, EVI, and NDVI can show similar values in different CHs of CHOCO, reducing the magnitude of their correlation to the CH. Conversely, the lower rainfall in the dry forest (2050 mm) with four dry months in the year generates lower moisture levels and higher leaf deciduousness during dry season, affecting lower CHs more strongly. Hence, SWIRs, EVI, and NDVI show lower values in the lower CHs, increasing the magnitude of their correlation with CH.

Map Accuracy Among Predictor Sets
The final key contribution of our work was showing possible overfitting effects when multispectral and SAR predictors are used to build CH maps. In the three tropical forest types using five different regression algorithms, our results showed differences in the accuracy of the CH map models agree with three sets of predictor variables. The best classification accuracies and explained variances were generated by a set of 20 multispectral and SAR predictors (ten X-means and ten X-SDs), including six predictors that did not show linear or nonlinear correlations to CH. When these six predictors were eliminated, the CH map model accuracies and explained variances decreased slightly in the dry forest (+/−0.1 m and −1%) and moist forest (+/−0.2 m and −2%) but considerably in the rainforest (+/−0.5 m and −11%), showing a higher effect of the accumulation of predictors in the accuracy estimation where the relationship between CH and predictors presented the highest uncertainty, suggesting a higher susceptibility to overfitting in this set of predictors [99]. We then eliminated collinear predictors, resulting in another slight decrease of the accuracies and explained variances in the dry forest and moist forest compares with a substantial decrease of the rainforest, showing again higher possible effects of overfitting in the rainforest by the accumulation of predictors. The previous examinations have confirmed the high susceptibility of remote sensing analyses to overfit their training set data [100].
The RF regressions produced the best accuracies and explained variances than the other regressions algorithms. Others have also found RF models are more accurate than other machine learning algorithms in remote sensing analyses [101,102]. Interesting, we also found that the accuracies produced by RF regressions were less affected by the changes of predictor sets than the other four regression algorithms. Observed accuracy did not differ substantially among the three predictor sets in the dry forest and changed only slightly in the moist forest. This lower susceptibility to overfitting would be another factor indicating RF has advantages over other learning algorithms in remote sensing analysis.

Conclusions
The accuracy of canopy height mapping using LiDAR surveys as reference sample can be improved by integrating multispectral and SAR band predictors. Each band's reflectance or backscatter is related to different forest ecological characteristics that vary with changes in canopy height, producing differential relations between canopy height and each band reflectance or SAR backscatter. Likewise, annual metrics improve the relationship between canopy height and our predictor variables, resulting in more accurate mapping of canopy height in tropical forests. We found the accuracy of canopy height maps produced from our analyses decreased from dry forest to moist and rainforest, likely a result of more structurally complex canopies presenting less annual and interannual variability due to effective partitioning of light and water resources at different canopy levels. Consequently, uncertainty and overfitting of canopy height is more difficult to constrain over rainforest areas. Although we did not model potential effects of the GEDI mission's spatial sampling approach, we processed our airborne discrete return LiDAR data to a similar spatial resolution (30 m). Thus, our results have relevance to improving the prediction and mapping of canopy height using this new data source that will be available in the near future and increase through time. Overall, our results suggest that metrics derived from multispectral and SAR imaging sensors will allow development of more accurate maps of canopy height and other forest structural variables across a range of forest types while GEDI data will allow the generality of these results to be tested on a much wider array of forest types than what is currently feasible with limited aircraft LiDAR coverage. The measure of other forest traits could be also improved by integrating annual metrics of multispectral and SAR bands, such as forest degradation, diversity, and phenology. We have interpolated the response and predictor and predictor variables to the spatial resolution of 30 m, the lowest spatial resolution among these variables. Thus, the estimation of canopy height and other forest traits could be improved by integrating annual metrics of remote sensors with higher spatial, spectral, and temporal resolution, such as the multispectral sensor Sentinel-2.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-4292/11/22/2697/s1, Figure S1: Accuracy of the map models estimated as MAE (Mean Absolute Error) in three tropical forests, Figure S2: Linear and nonlinear correlations between canopy height (CH) and X-means of multispectral and SAR predictors in three tropical forest, Figure S3: Linear and nonlinear correlations between canopy height (CH) and X-SDs of multispectral and SAR predictors in three tropical forest.