Weak Environmental Controls of Tropical Forest Canopy Height in the Guiana Shield

: Canopy height is a key variable in tropical forest functioning and for regional carbon inventories. We investigate the spatial structure of the canopy height of a tropical forest, its relationship with environmental physical covariates, and the implication for tropical forest height variation mapping. Making use of high-resolution maps of LiDAR-derived Digital Canopy Model (DCM) and environmental covariates from a Digital Elevation Model (DEM) acquired over 30,000 ha of tropical forest in French Guiana, we ﬁrst show that forest canopy height is spatially correlated up to 2500 m. Forest canopy height is signiﬁcantly associated with environmental variables, but the degree of correlation varies strongly with pixel resolution. On the whole, bottomland forests generally have lower canopy heights than hillslope or hilltop forests. However, this global picture is very noisy at local scale likely because of the endogenous gap-phase forest dynamic processes. Forest canopy height has been predictively mapped across a pixel resolution going from 6 m to 384 m mimicking a low resolution case of 3 points · km − 2 . Results of canopy height mapping indicated that the error for spatial model with environment effects decrease from 8.7 m to 0.91 m, depending of the pixel resolution. Results suggest that, outside the calibration plots, the contribution of environment in shaping the global canopy height distribution is quite limited. This prevents accurate canopy height mapping based only on environmental information, and suggests that precise canopy height maps, for local management purposes, can only be obtained with direct LiDAR monitoring.


Introduction
The height of a tropical forest canopy is a key feature of the functioning and dynamics of a tropical forest ecosystem [1]. Canopy height is mainly determined by the natural regime of disturbances inducing the gap-phase forest dynamic [2], but also depends heavily on the impacts of human activity such as fire occurrence in disturbed forests [3] or logging activity in production forests [4]. In the global change context, precise knowledge of forest canopy height is also a prerequisite to accurately predict the aboveground forest biomass, a key component of the terrestrial carbon cycle [5]. At the individual tree level, forest canopy height is an important parameter to improve tree allometric equations [6]. Indeed, most allometric equations estimate this height value from a statistical inference procedure the relationship between forest structure and environmental variables and the effect of sampling strategy on the accuracy of the canopy height map prediction with airborne LiDAR in French Guiana. It is interesting to understand the mechanisms that control forest structure and provide a map of forest canopy height for forest manager at large scale. However, a common feature of data regularly sampled in space, such as the LiDAR and environmental variables used in this study, is that they typically exhibit spatial autocorrelation [34]. Most statistical tests of association require independence between observation data, a condition always being violated when autocorrelation is present within the data [35]. In this study, we take advantage of a randomization method recently developed by [36] to test the association between two spatially structured variable sets. In the present study, we use high-resolution maps of LiDAR-derived DEM and DCM to investigate the role of a large panel of potential environmental drivers on variations in forest height in tropical forests. More specifically, we address the following questions: 1.
What is the scale and magnitude of spatial autocorrelation of tropical forest heights? 2.
Are there links between environmental variables and canopy heights in tropical forests and do these links depend on the spatial resolution of the LiDAR-derived maps? 3.
Are environmental variables important drivers of tropical forest canopies and do they bring useful information in the perspective of predictive mapping?

Study Site
The study site is located in the Regina forest (4 • N, 52 • W) ( Figure 1). The most common soils in the Regina forest are ferralitic soils. The site is located on slightly contrasting plateau-type reliefs that are rarely higher than 150 m on average. The forest is typical of Guianan rainforests. The dominant families in the Regina forest include Burseraceae, Mimosoideae and Caesalpinoideae. The site receives 3806 mm of precipitation per year, with a long dry season from mid-August to mid-November, and a short dry season in March [13].

LiDAR DEM and DCM
LiDAR data were acquired in 2013 over 30,000 ha of forest by a private contractor, Altoa [37] by aircraft. The LiDAR system used for acquisition was a Laser Riegl LMS-Q560. This system was composed of a scanning laser altimeter with a rotating mirror (Riegl LMS-Q560), a GPS receiver (coupled to a second GPS receiver on the ground), and an inertial measurement unit to record the pitch, roll, and heading of the aircraft. The laser wavelength was 0.9 µm (near infra-red) . The aircraft flights were conducted at 500 m above ground level with a ground speed of 180 km·h −1 . Each flight derived two acquisitions. The LiDAR was operated with a scanning angle of 60 • and a 200 kHz pulse repetition frequency. The laser recorded the last reflected pulse with a precision better than 0.10 m with a density of 5 pulses·m −2 . The ground routine of Terrascan software (Terrasolid Ltd., Helsinki, Finland) was used to filter raw points and identify ground echoes from which the Digital Elevation Model (DEM) was derived from a triangular interpolation network. A Digital Surface Model (DSM) was then obtained by extracting the highest echo on a 1 m grid, and a canopy height model (DCM) was finally derived by subtracting the 1 m-resolution DTM from the 1 m-resolution DSM. Because Deblauwe's [36] algorithm requires square raster maps, we clipped three plots in the natural forest: p1, p2, and p3 ( Figure 1).

Environmental Variables
In order to evaluate the relationship between the DCM and the environment, we extracted several topographical and hydrological variables that we believe are important for driving canopy height variation ( Figure 2) ( Table 1) The Hydraulic Altitude (HA) was computed from the 3rd order hydraulic system. The HA of a cell is its altitude above the closest stream of its hydraulic basin. • In order to map terra firme versus floodplain forest, we used a predictive model adapted from the HAND topographic algorithm (Heigh Above the Nearest Drainage ) using a high resolution DTM. The HAND binary index has been developed to model soil water conditions in Amazonia rainforest and proved to provide good proxy of permanent water saturation in other contexts [38]. We classify pixels with HAND ≤ 2 m as floodplain and other pixels as terra firme.
Topographical variables

•
We computed the slope from the DEM as the maximum rate of change in elevation from a cell to its eight neighboring cells over the distance between them. • The Terrain Ruggedness Index (TRI) catches the difference between flat and more mountainous landscapes. The TRI was calculated using Arc-GIS 10.2 as the sum of the change in altitude between a grid cell and its eight neighbors. • We used the Topographic Exposure index (TOPEX) to measure exposure to the wind. TOPEX is a variable that represents the degree of shelter assigned to a location. TOPEX was derived from the quantitative assessment of the horizontal inclination. The values of this index are closely correlated with the wind-shaped index. Exposure is calculated based on the height and distance of the surrounding horizon, which are combined to obtain the inflection angle. We used this angle to quantify topographic exposure. A higher inflection angle is equal to a lower exposure or more shelter [23].

Data Analysis
As input data, we extracted three large plots (p1, p2, p3, each 3782.25 ha, corresponding to a grid with 1230 × 1230 grid cells of 25 m 2 ) in our study area. The three plots were chosen in order to cover the maximum area of LiDAR coverage. The shape of the plots is important because the Deblauwe algorithm what we use only works with square rasters (number of lines = number of columns).

Spatial Autocorrelation
The scale and magnitude of spatial autocorrelation in canopy height was determined using the Moran index [39]. Moran's Index is based on the a priori specification of neighborhood connections between observations, taking values close to 1 in case of positive spatial autocorrelation, close to −1 in case of negative spatial autocorrelation, and close to 0 when there is no spatial autocorrelation. We obtained a Moran correlogram, which represents the values of the Moran Index according to distance (cell size: 5 m) in plots p1, p2, and p3.

Link between Forest Canopy Height and Environmental Drivers at Different Scales
We used a wavelet-based technique developed by [36] on squared plots, to test pairwise association between the spatial structure of the digital canopy model and the environmental variables at multiple scales. The procedure can be applied to an unbiased test of any test statistic of association. We chose to test the Pearson's product-moment correlation coefficient r obs between gridded data sets of canopy height and environmental data.
We calculated confidence intervals for the null hypothesis of no correlation (H 0 : r = 0) using Monte Carlo methods. Specifically, we randomized one of the processes, in our case the DCM (canopy height data), in a way that preserves the same probability density function as the original template and also preserves the autocorrelation function. From Monte Carlo simulations, we generated 999 Dual-Tree Complex Wavelet Transform (DT-CWT) data surrogates for the DCM and environmental variable (in a quadrat subset) to determine the p-value of the null hypothesis. DT-CWT, is an improvement on classical discrete wavelet transform techniques that is immune to shift dependence, i.e., major variations in the distribution of discrete wavelet transform coefficient energy over scales and orientations caused by small integer displacements of the pattern. DT-CWT also has the advantage of a better distinction and reproduction of possible pattern directions. We then compared the correlation between the observed DCM and environmental variables with the 95% confidence interval generated from these surrogates. In the case of Pearson's product-moment correlation coefficient, which may vary between −1 and 1, the test is two-tailed, and we consider the proportion of simulations for which the correlation r sim falls outside the interval [−r obs , r obs ] where r obs is the observed Pearson's correlation coefficient. Because hydrological and topographic variables are structured at multiple scales, and in order to assess the importance of the spatial scale in shaping the link between forest canopy height and environmental drivers, we apply the test to the grids re-sampled at six different resolutions: 1024, 512, 256, 128, 64, and 32 pixels, which correspond to 6,12,24,48,96,192, and 384 m per pixel, respectively. Conversion to larger pixel size was done by using the "aggregate" function of the Raster package, using the mean height estimate.

Prediction of Forest Height from Environmental Drivers
We predicted the forest canopy height using Kriging with the following Equation (1): where z(s 0 ) is the predicted value at an unsampled location s0, m(s 0 ) is the fitted trend, e(s 0 ) is the Kriging residual, λ i are the Kriging weights determined by the spatial autocorrelation structure (variogram) of the residual, and e(s i ) is the residual at location s i . We used the Ordinary Kriging (OK) model, allowing the interpolation of unsampled data from the semivariogram [40]. The semivariogram plots the semivariance γ as a function of the distance between samples h: where γ(h) is the semivariance as a function of the lag distance h, N(h) is the number of pairs of data separated by h, and z is the estimated canopy height at locations us i and (s i + h). Three parameters are commonly used to model the behavior of semivariograms: • The range is the lag distance at which the sill is reached. The range is the distance where the spatial correlation vanishes and the variogram levels off. • The sill is the semivariance at the lag distances above which there is no spatial autocorrelation. • The nugget is the semivariance at a lag distance of zero; it represents the spatial variability below the sampling frequency.
In this study, we postulated an exponential form to the covariance model, which takes the form: where a is the range, h is the lag, c 0 is the nugget, and c 0 + c is the sill. For each case, the mean semi-variance was computed for 10 distance classes (i.e., limits at 250 m, 500 m, 1 km, 1.5 km, 2 km, 2.5 km, and 3 km) and compared with the null hypothesis of an absence of spatial structure simulated by 999 randomizations of canopy height. OK was then used to estimate values of z at location s 0 using the following Equation (4):

Strategies to build height maps
We first sampled 500 points from p1, p2, and p3, and for each pixel size (6,12,24,48,96,192, and 384). We chose 500 points because this number corresponds to a low-resolution case (GLAS spaceborne) of about 3 points·km −2 . Then, we made two types of predictions: • Inside the calibration plots, where we compared the prediction with and without environmental covariates (Slope, TOPEX, TRI, HA, DA, HAND) • Outside the calibration plots, where we compared predictions with environment-only to a null model (no calibration points, no environment) To do so, we defined a regularly spaced grid covering the study area with pixel sizes of 6,12,24,48,96,192, and 384 m. For each pixel size and for each type of prediction (with and without environment, inside or outside), 10 maps were computed, a new sampling of 500 points each time to generate confidence intervals. For each pixel size, the canopy height was predicted and the accuracy of the spatial prediction of canopy height (H) was assessed by comparing predicted values with LiDAR values (H) from DCM using the Root Mean Square Error (RMSE): Arc-GIS 10.2 software was used to create environmental variables from DTM. MATLAB software (The MathWorks Inc., Natick, MA, USA, 2012) was used to perform environmental variable association tests. Most of the analysis was performed with the R project software [41] and more specifically, with the following packages: rgdal [42] and maptools [43].

Spatial Variability of Canopy Height
In order to characterize the range of magnitudes of the spatial autocorrelation, Moran's Index was used for each plot (p1, p2, p3). The three correlograms are very similar and significant at low distances (I p1 = 0, 018; I p2 = 0.019; I p3 = 0.017) (Figure 3). The main conclusion is that Moran's Index decreases with distance and is non-significant beyond 2500 m.

Environmental Drivers of Canopy Height
We found that the investigated environmental variables were significantly correlated with forest canopy height and that the degree of correlation varied strongly with the pixel size for all variables except the slope (Figure 4). We note a decrease in the absolute value of most coefficients of correlation when pixel sizes decrease. We observed (i) two variables negatively correlated with forest height: HAND and DA; and (ii) four variables positively correlated: TOPEX, HA, TRI, and SLOPE.

Effect of Hydrological Variables
We found a very strong positive correlation between HA and canopy height variation for a pixel size of (192 m × 192 m), r = 0.47, p-value < 0.001. The HAND index also showed a very high correlation with forest canopy height, with the highest for a pixel size of 3.6 ha (192 m × 192 m), r = −0.62, p-value < 0.001.

Effect of topographical variables
TOPEX was positively correlated with canopy height r = 0.34, p-value < 0.001; when the canopy was exposed to wind, we observed an increase in canopy height. TRI was a positive predictor of canopy height variation with a pixel size of 0.14 ha (12 m × 12 m), where we observed a maximum of coefficient correlation, r = 0.17, p-value < 0.001. Finally, we found non significant correlation between slope and canopy height with r = 0.066, p-value = 0.41.

Discussion
In this study, we investigated the spatial structure of canopy height variations and its relationship with the environment across spatial scales of 6 to 384 m using high-resolution maps of LiDAR-derived digital canopy height and elevation for 30,000 ha of tropical forests in French Guiana.

Spatial Auto-Correlation in Forest Canopy Height
As hypothesized, canopy height is spatially auto-correlated at a fine scale. However, we highlight a residual spatial structure even at long distance (i.e., 2500 m) that was not really expected. This suggests that canopy height variation results from complex interactions with biotic and abiotic factors at various spatial scales. Taking into account this spatial structure is thus of primary importance for predictive mapping. For instance, Fayad et al. [33] strongly improved the precision of canopy height estimates using regression Kriging (from 6.5 to 4.2 m in RMSE using the GLAS dataset, and from 5.8 to 1.8 m using an airborne LiDAR dataset).

Spatial Structure at Fine Scale
At fine resolution, the individual tree crown structure may be detectable and drives the auto-correlation signal [44]. In the same way, a forest gap induced by a single tree fall [45] can be observed, and structures the LiDAR signal. Gap phase dynamics play an important role in forest structure, because the formation of gaps creates openings from 10 m 2 to 1000 m 2 , which provide ideal conditions for tree growth but also modify the forest height by creating disrupted forest canopies. At this fine spatial scale, competition for light between individual trees colonizing the forest gaps drives the changes, in time and in space, in forest canopy height. Such competition for light leads to tree crown rearrangement and to the clustering of trees growing together in the form of homogeneous patches of similar canopy height up to 60 m [46,47]. These higher biotic interactions, typical of many tropical forests, are likely to drive the spatial structure of forest height at fine scale. Our results thus suggest that observations of canopy height with high resolution give useful information on the spatial extent of forest dynamics. Another possible explanation for the spatially structured forest height at fine scale is related to the fine-scale spatial structure of the forest environment [48]. Clark et al. [2] observed that the spatial structure of abiotic factors such as soil types or topographical levels in tropical forests may modify tree growth in both diameter and height. We thus have to keep in mind that abiotic factors such as edaphic conditions may also indirectly contribute to create spatial structures of forest height at very fine scale.

Spatial Structure at Large Scale
The spatial auto-correlation at large scale (>100 m) could be explained by landscape-scale variation in topography. For instance, several studies observed that dendrometric (diameter and height) measurements and allometric relationships for one species could be different according to the type of soil [6], which is directly linked to the type of water drainage [11]. Indeed, in French Guiana [49], examined in detail the spatial distribution of vegetation in 19 ha of tropical forests and concluded that large soil drainage classes affect tree height. Moreover, at even larger scales, geomorphology and landform variations may contribute to modify the forest canopy height [50]. On our three plots, we observed a deep vertically draining soil and the same type of landscape [11]. We may thus suppose that the absence of spatial structure in forest canopy height at very long distances (greater than 2500 m) is partly linked to the low geological and geomormophologic variation of our study area.

Detecting Environmental Drivers with Multi-Scale Analysis
Many authors link the canopy height of mature forests to edaphic factors, and notably, to drainage conditions [50] . Our results expand on previous findings on the influence of topographic variation on tropical forest structure by identifying the major topographic variables and the scales at which relationships are strongest, and by clearly illustrating their link to hydrological processes.

Observations from Fine to Large Scale
Regardless of scale, the behaviors of the Pearson's correlation coefficients are coherent, i.e., even with a change in pixel size, and the signs of all the correlation coefficients are not reversed. A second common observation is related to the increase in the absolute value of the correlation coefficient when the scale of observation increases. This result could be explained by the fact that at fine scale (<0.05 ha), we observe the height of each tree and the forest gap dynamic fingerprints, while at large scale (>1 ha), we observe forest height variation due to topographical heterogeneity. We thus highlight that the transition from fine to large scale removes the effects of local variations that are related to specific local forest history.

Effects of Specific Environmental Variables
All the environmental drivers used in this study were successfully related to forest canopy height variation. However, hydrology-related variables such as HA, HAND, and DA are much more significantly related to forest canopy height than SLOPE or TRI. Sabatier [51] noticed that the distribution of stand heights reflects soil conditions, while Lescure and Boulet [52] highlighted that soils with deep vertical drainage have a greater number of taller trees. Our results are in accordance with larger landscape scales. We observed an unexpected effect of slope and terrain ruggedness on forest height, unlike Ashton and Hall [12], who observed that in Borneo, the tallest forests are generally found on lower slopes and flat or undulating land. The Regina forest is characterized by a gentle hilly landscape comprised of ferralitic soils [49]. The ferralitic soils may have mechanical properties that make it possible to have a high canopy. Topographic exposure to wind showed a significant effect on forest height; however, this result was again not expected, i.e., canopy height was higher on the more exposed pixels. The Amazonian forests are rarely thought to be strongly affected by wind or hurricanes. However, pixels that are most exposed to wind are on hilltops where environmental conditions such as drainage, nutrients, and soil chemical properties are highly favorable to tree growth [11]. The effects of wind are well known on insular systems or in places where hurricanes are frequent [20]. In French Guiana, this is clearly not the case, and we believe that the positive effect of exposure is likely due to an indirect soil effect.

Mapping Canopy Height with Low-Resolution Remote Sensing Information
Using a few points (N = 500) to map the canopy height inside the calibration plots, we showed that (i) RMSE values remained quite high compared to a constant forest height null model; and that (ii) the input of additional ancillary environmental covariates only slightly improved the prediction accuracy ( Figure 5). This result is quite surprising given that many recent studies have mapped canopy height at large scale from LiDAR spaceborne data. For instance, [47] estimated, from GLAS and 500 m Moderate Resolution Imaging Spectroradiometer (MODIS) data, a canopy height with an RMSE of 5.9 m. Building on [47], Simard et al. [9] drew a global canopy height map (RMSE of 6.1 m) making use of environmental ancillary data such as the annual mean precipitation, seasonal precipitation, annual mean temperature, seasonal temperature, and data from a DEM and percentage tree cover provided by MODIS. In these two recent studies, the use of GLAS allowed the mapping of canopy height at coarse resolution on the global scale. Our results show that forest height can indeed be estimated using points sampled from LiDAR waveforms ( Figure 6), but when high-resolution mapping is needed (e.g., for forest management) it fails mostly because the environmental heterogeneity, at this scale, is too low to deeply shape the forest canopy height variation. Other problems that arise in comparing results from low-resolution spaceborne to high-resolution airborne LIDAR are that (i) the altimetric error of airborne LiDAR is better than that of spaceborne LiDAR; and (ii) GLAS Lidar values are strongly influenced by terrain slope. Indeed, Hilbert and Schmullius [53] showed that terrain slope plays a key role in the spaceborne LiDAR height estimate, while airborne LiDAR is only slightly influenced by terrain slope effects. This may explain the a priori low performance of our height model map when compared to spaceborne LiDAR studies.

Do Environment Variables Really Help to Map Forest Canopy Height?
On one hand, outside of the calibration plots (cases with absolutely no information), the environmental covariates only slightly improved the RMSE. On the other, inside the calibration plots (cases with low-resolution information), the environmental covariates barely helped to improve the forest height accuracy. This raises questions as to the usefulness of environmental data in predicting forest canopy height at the forest management scale (dozens of hectares). This also means that the prediction accuracy will significantly depend on the LiDAR sampling intensity and pixel resolution ( Figure 5). Outside of the calibration plots, the effects of the spatial Kriging quickly disappear [32] and only the linear covariables (environment) effects remain. Because of this strong spatial structure, another sampling strategy, e.g., using transect lines, may be a good option [32,33]. However, when transect lines or subsamples with high sampling intensity are not available, the computing costs of the spatial model may seem prohibitive for mapping canopy heights at large scale. Finally, the contribution of environmental covariates is higher at low-resolution where the control of topography, landforms, and hydrological processes on the variability of forest canopy height is clear. At fine resolution, the height variability is more controlled by forest endogenous dynamic processes, e.g., the gap phase dynamic typical of many undisturbed tropical forests. It may be expected that the environment plays a key role in this endogenous dynamic [19], but its effect is hardly detectable when looking only at the forest canopy height.

Conclusions
In this study, we evaluated the spatial structure of the canopy in natural forests and we analyzed environmental factors affecting canopy height distribution in French Guiana. Our results provide evidence of the scale-dependent linkage of the topographical and hydrological variables in shaping the canopy structure of tropical forest. Furthermore we mapped forest canopy height on 30,000 ha with LiDAR data, it appears that it could be possible to map forest canopy height even if at fine scale they have high error values of prediction. On one hand, using the kriging method shows that prediction accuracy of forest canopy height strongly depends on sampling intensity. On the other, when we try to predict canopy height with environmental variables we observed that the pixel resolution play an important role in the final error of prediction. Future work should evaluate the generality of our results at a regional scale, and aim to elucidate the ecological mechanisms that underlie them. Studies should specifically investigate how canopy height is control by soil depth, geological formation, climate and, and how spatial variance in DCM scales in these Guiana Shield forests. Understanding the roles of different factors is likely to be challenging because many factors covary spatially. Study design, and especially appropriate choice of scales, is thus a critical issue. Well-designed studies of this kind have the potential to greatly improve our understanding of tropical forest structure for forest managment, and our ability to project the responses of tropical forests to anthropogenic global change.