Topography and Three-Dimensional Structure Can Estimate Tree Diversity along a Tropical Elevational Gradient in Costa Rica

This research seeks to understand how tree species richness and diversity relates to field data (1-ha plots) on forest structure (stems, basal area) and lidar derived data on topography and three-dimensional forest structure along an elevational gradient in Braulio Carrillo National Park, Costa Rica. In 2016 we calculated tree species richness and diversity indices for twenty 1-ha plots located along a gradient ranging from 56 to 2814 m in elevation. Field inventory data were combined with large footprint (20 m) airborne lidar data over plots in 2005, in order to quantify variations in topography and three-dimensional structure across plots and landscapes. A distinct pattern revealing an increase in species’ richness and the Shannon diversity index was observed in correlation with increasing elevation, up to about 600 m; beyond that, at higher elevations, a decrease was observed. Stem density and basal area both peaked at the 2800 m site, with a mini-peak at 600 m, and were both negatively associated with species richness and diversity. Species richness and diversity were negatively correlated with elevation, while the two tallest relative height metrics (rh100, rh75) derived from lidar were both significantly positively correlated with species richness and diversity. The best lidar-derived topographical and three-dimensional forest structural models showed a strong relationship with the Shannon diversity index (r2 = 0.941, p < 0.01), with ten predictors; conversely, the best species richness model was weaker (r2 = 0.599, p < 0.01), with two predictors. We realize that our high r2 has to be interpreted with caution due to possible overfitting, since we had so few ground plots in which to develop the relationship with the numerous topographical and structural explanatory variables. However, this is still an interesting analysis, even with the issue of overfitting. To reduce issues with overfitting we used ridge regression, which acted as a regularization method, shrinking coefficients in order to decrease their variability and multicollinearity. This study is unique because it uses paired 1-ha plot and airborne lidar data over a tropical elevation gradient, and suggests potential for mapping species richness and diversity across elevational gradients in tropical montane ecosystems using topography and relative height metrics from spaceborne lidar with greater spatial coverage (e.g., GEDI).


Introduction
Tropical forests are experiencing high rates of deforestation and degradation; this has had in a strongly adverse impact on biodiversity and ecosystem services [1][2][3].In recent years, the international community has recognized the loss of diversity-and the associated impact upon ecological functions and services-as a key threat to the sustainability of tropical ecosystems [2].Quantifying patterns and understanding the processes that maintain species diversity across tropical landscapes are considered among the ten most challenging problems that require spatial data from combined ground and remote observations to resolve [4].Tropical montane forests are of particular interest because of their high diversity, the complexity of their landscapes, the intensity of degradation from anthropogenic forces, and their vulnerability to changing climate.
Patterns of diversity along elevational gradients in tropical montane forests have been the subject of many studies [5][6][7][8].Diversity along elevational gradients was initially believed to decline linearly from the warmer lowlands to the cooler highlands, analogous to patterns exhibited by increasing latitude [8][9][10][11][12].However, recent tropical montane studies that examine the full extent of elevational gradients show a pattern characterized by a mid-elevation diversity peak of trees, epiphytes, and mammals [5,8,[12][13][14][15][16][17][18][19][20][21][22].Species diversity is hypothesized to be greatest midway up the gradient, at an ecotone type environment, with species intermingling from both lower and higher elevations [6,19,23,24].However, the structure of the forest may also be a key mechanism for diversity patterns, and may also explain why there are more species at mid-elevations.In particular, field data suggests that stem density and biomass may be associated with diversity along elevation gradients, and new technologies such as lidar are available to examine the micro-topography and three-dimensional structure of tropical forests [25][26][27].
Several aspects of forest structure data, including stem density, basal area, and aboveground biomass, are generally collected in the field and may be associated with tree diversity [28,29].In general, sites with higher numbers of individuals per unit area, e.g., higher stem density, tend to have more species [6,13,17,18,[30][31][32][33][34].A similar pattern has also been noted for basal area and biomass of forests, with the high diversity in areas with the highest basal area [28,35,36].However, there have been a number of studies in lowland rainforests that have reported only weak relationships between diversity and basal area/biomass, as just a few large trees may contain much of the biomass [37,38].
Canopy and understory structural variations and topographical metrics have been associated with patterns of diversity in tropical lowlands, and can be quantified through the use of lidar remote sensing [37,39].Lidar can provide waveforms of the returned energy from relatively small to large footprints, over large areas.Prior research has found that lidar waveform data closely matched field-collected vertical canopy profiles in the lowland forest [40,41].If there is a reasonable relationship between the lidar-derived topography, and three-dimensional structure of a forest stand and plot data are representative of the landscape, it should be feasible to predict tree diversity across an entire area covered by a lidar swath.Utilizing lidar technology over larger areas may be assessed more quickly than ground-collected data alone, and may help to pinpoint critical areas for conservation priorities [42][43][44].
There are a number of theoretical and empirical reasons why lidar metrics on topography and three-dimensional structure should be associated with patterns of diversity.The topographic variables of elevation, slope, and aspect have been reported to show varying levels of predictive power for diversity; these can easily be quantified with lidar [24,28,45].In particular, steep slopes generally result in areas with less soil development, lower levels of soil nutrients, and less water retention, and hence, less diversity than in valleys or on mild slopes [24,28,37,45,46].In addition, it has also been hypothesized that aspect, or the orientation of a site, may be associated with diversity with areas with less direct sun light.Such areas may maintain higher water holding potential, and thus support greater species richness [37,47].Studies in tropical forests have clearly shown that the development of high and structurally complex canopies results in greater species richness [37,48].In particular, the mean and standard deviation of tree or canopy height have been associated with high levels of diversity in tropical forests, with taller trees and more heterogeneity in the canopy permitting a wider niche for other tree species to persist [39,49,50].Recent studies have also shown that upper canopy variability is correlated with increased species richness among trees (>1 cm dbh), possibly due to partitioning of light resources [37].In particular, metrics on canopy height and heterogeneity have been able to predict 25% of the variation of species richness for trees ≥10 cm dbh in lowland rainforests [39].
This research seeks to assess how tree species richness and diversity changes along a Neotropical elevational gradient within Braulio Carrillo National Park in Costa Rica.This research has three primary research objectives.First, we test if species richness and diversity and the mid-elevation peak is associated with forest structure metrics (stems, basal area) collected in 1-ha plots in the field.Second, we test whether topographical and forest structural metrics derived from lidar were associated with tree species richness and diversity within 1-ha plots.Third, we examined if lidar-derived topographical and three-dimensional structural metrics within plots are representative of the overall landscape of Braulio Carrillo National Park.

Study Area
This research was conducted along a unique tropical forest elevational gradient in Costa Rica, with 1-ha plots spanning from within the La Selva Biological Station at 56 m above sea level (m.a.s.l) up to 2814 m.a.s.l. in Braulio Carrillo National Park (BCNP).This elevational gradient, protected by a National Park, is quite unique for Central America, spanning several ecoregions, from lowlands to high elevation cloud forest, and providing the conditions for high species turnover and diversity [51].The lowland forest in this region has an average annual rainfall of ~4000 mm, with ~9000 mm at mid-elevations and ~3000 mm at the peak [6,22,52].The average temperature is 26 • C in the lowlands; temperatures decrease with increased elevation, down to ~10 • C at the peak [28].Soil types vary over the region, but all are influenced by the past activity of the Barva Volcano; parent materials of basaltic and andesitic lavas from the Plio-Pleistocene age are common along the gradient, with more agglomerate tuff-like soil at high elevations [53].Due to volcanic activity, the soils are younger and have endured less weathering near the peak of the Barva Volcano [6,53].Besides the uniqueness of the protected area extending over multiple life zones, this area is also special because it offers paired field and airborne lidar data over a tropical elevation gradient, which few other places in the world offer [40,52].We use the term 'landscape' when we refer to the area within Braulio Carrillo National Park, which is a mountainous area surrounded by lowlands covered in agricultural land.Here, we are only concerned with the landscape of the mountainous and forested protected area of Braulio Carrillo National Park and La Selva Biological Station, covered by the lidar dataset.

Field Data
Vegetation surveys were conducted along the gradient within BCNP for 20 1-ha plots ranging from 56 m.a.s.l. to 2814 m.a.s.l.(Figure 1).Tree diameter at breast height (DBH, 1.3 m) was measured using a fabric diameter tape +/−1 mm resolution for all trees ≥10 cm DBH.Species were determined in field and identified using herbarium vouchers.Nine of the 1-ha plots were from a collaborative effort of Conservational International's TEAM Project [54], and 11 1-ha plots at <120 m.a.s.l. at the La Selva Biological Station, as part of the Carbono project [28].For both datasets, the survey data from 2011 was used to assess patterns of tree species' richness, diversity and structural characteristics (stem density and basal area) along this elevational gradient.The data used in this study is currently available online, at http://www.teamnetwork.org/[54].

Lidar Remote Sensing
Lidar data were used to assess topographic variation (elevation, slope, and aspect), as well as three-dimensional forest structure metrics (canopy height and understory variation) across the majority of BCNP and within the 1-ha plots (Table 1).We used NASA's Land, Vegetation, and Ice (LVIS) sensor, a medium altitude airborne laser altimeter system, operating up to 10 km above ground, with a varying footprint, but usually flown in a medium resolution with circular footprints of 10-30 m in diameter.LVIS was flown over BCNP in 2005 between March 22nd-30th on the DOE King Air B-200 platform [43,55,56].Within each footprint, lidar measures distance from the instrument to the surfaces below, resulting in several signal returns, effectively capturing canopy height with the first returns and a digital elevation model (DEM) with the last.The waveform outputs can help to derive canopy heights, vertical relative height metrics (rh metrics), and topographic features, relative to the WGS-84 ellipsoid [43].The DEM from the last return of the LVIS signal was used to calculate elevation, slope in degrees, and aspect which was converted to cos(aspect).For three-dimensional structure, the LVIS waveform was segmented into quartiles of energy, also known as relative height metrics (rh100, rh75, rh50, and rh25), to quantify three-dimensional forest structure [55].These relative height metrics allow for the assessment of forest structure from the upper canopy (rh100), the canopy (rh75) the sub-canopy (rh50) and the understory (rh25) at the 1-ha plot level.Past research has found that these four metrics provide enough information from the waveform to study forest structure [26,27,39,43].
While there is a temporal discrepancy between the LVIS data collection and more recent field collections, we did compare 2005 field data to the 2016 data for a few plots that were available, and these displayed similar values of diversity across all life forms; we are assuming the natural growth and treefall occurrence over the interim years.In addition, from multiple visits to the sites, we do not believe that the structure or levels of species richness have changed significantly between the time periods.

Data Analysis
Species richness and the Shannon diversity index were calculated for each plot [7,57].Species richness was defined as the number of tree species ≥10 cm dbh per ha, while the Shannon diversity index is a metric of diversity that also includes information on evenness or the proportion of individuals of each species, so it also takes into account the community structure as a whole (S1) [58,59].The mean and standard deviation of topographic metrics (elevation, slope, aspect) and three-dimensional structure metrics (rh25, rh50, rh75, rh100) were calculated at the plot level.Pearson correlations and regression analyses in R were used to determine the relationship between species richness and diversity and field data on forest structure (stem density and basal area from plots), and lidar-derived metrics (topographical metrics and forest three-dimensional structure within the plots).
In order to extrapolate the plot-level data to the landscape, we fit linear models that predict species richness and diversity from field-and lidar-derived topographical and three-dimensional forest structure metrics.For each response variable, we used the leaps package in R to optimize the multiple linear regression model by fitting an exhaustive set of models-including all combinations of predictor variables-and choosing the model that minimized the Bayesian Information Content (BIC) [57,60].In addition to using BIC as a measure of performance on each fitted model, cross-validation was also performed to optimize the number of predictors included in the multiple linear regression, by estimating the testing Mean Squared Error (MSE) of optimal models with each number of predictors.Because many predictors were included in the multiple linear regressions compared to the number of observations in the data, our model was very flexible.We constrained this flexibility in order to reduce overfitting by using ridge regression.
Ridge regression acted as a regularization method, shrinking coefficients in order to decrease their variability and multicollinearity.The linear regression and ridge regression models were compared using 5-fold cross-validation to estimate the testing MSE, in order to judge the extrapolation quality of our field data derived model [61,62].We compared the mean and standard deviation of topography and three-dimensional structure metric from the field plots with the landscape based on lidar derived metrics.We calculated canopy heights across 100 m bins of elevation, which we refer to as landscape data.Plot data was paired with lidar data, and a Shapiro-Wilk normality test showed that the differences between them were approximately normally distributed.Paired t-tests were performed between plot and landscape data to identify if topography (slope and aspect) and three-dimensional forest structure (rh100, rh75, rh50, rh25) from the plots were significantly different from landscape data.

Species Richness, Diversity, and Forest Structure from Plots
There were 403 identified tree species ≥10 cm dbh in the 20 1-ha plots, representing 87 different families.Tree species richness and diversity changed in a non-linear fashion along the montane gradient (Figure 2); species richness (178 sp.) and Shannon diversity (4.64) both peaked at the plot located at 552 m elevation, VB3 (Table 1).From the lowest elevation sites, species richness and diversity steadily increased up to 552 m, then declined, with an exceptionally large decline after the 2355 m plot.This 2355 m plot had a species richness of 49 and a Shannon diversity index of 3.2, and the highest elevation site (2813 m) had a species richness of 13 and a Shannon index of 1.77 (Table 1, Figure 2).From the lowest elevation sites, species richness and diversity steadily increased up to 552 m, then declined, with an exceptionally large decline after the 2355 m plot.This 2355 m plot had a species richness of 49 and a Shannon diversity index of 3.2, and the highest elevation site (2813 m) had a species richness of 13 and a Shannon index of 1.77 (Table 1, Figure 2).Stem density increased from 360 stems per hectare at the 56 m plot to 657 stems at 386 m; it then decreased to 606 stems at 552 m, and 397 stems at 1425 m, and exceptionally increased to 1221 stems at the 2814 m plot (Table 1).At lower elevations, basal area had a pattern similar to stem density, with a peak at the 106 m plot (28 m 2 ) and 552 m (27.5 m 2 ); it then decreased before a slight increase at 1425 m.The two highest plots, 2355 m and 2814 m, had the highest basal area along the elevational gradients (35.8 m 2 and 66.2 m 2 respectively).
Species richness was not significantly correlated with stem density and basal area; however, Shannon diversity was significantly negatively correlated with stem density (Figure 3) and basal area (Table 2).The results of Pearson correlation tests between each of these response variables of species richness and diversity with stem density were r = −0.272,p < 0.25 and r = −0.517,p < 0.02, respectively.The results of Pearson correlation tests between each of these response variables and basal area were r = −0.550p < 0.002 and r = −0.739,p < 0.001, respectively (Table 2).The inset shows the variations of diversity across lowland sites where more plots were available for this study.Similar variability may exist in higher elevations but no additional replicates of plots were available to verify.
Stem density increased from 221 stems per hectare at the 56 m plot to 657 stems at 386 m; it then decreased to 606 stems at 552 m, and 397 stems at 1425 m, and exceptionally increased to 1221 stems at the 2814 m plot (Table 1).At lower elevations, basal area had a pattern similar to stem density, with a peak at the 106 m plot (27.9 m 2 ) and 552 m (27.5 m 2 ); it then decreased before a slight increase at 1425 m.The two highest plots, 2355 m and 2814 m, had the highest basal area along the elevational gradients (35.8 m 2 and 66.2 m 2 respectively).
Species richness was not significantly correlated with stem density and basal area; however, Shannon diversity was significantly negatively correlated with stem density (Figure 3) and basal area (Table 2).The results of Pearson correlation tests between each of these response variables of species richness and diversity and stem density were r = −0.272,p < 0.25 and r = −0.517,p < 0.02, respectively.The results of Pearson correlation tests between each of these response variables and basal area were r = −0.550p < 0.002 and r = −0.739,p < 0.001, respectively (Table 2).
Species richness was not significantly correlated with stem density and basal area; however, Shannon diversity was significantly negatively correlated with stem density (Figure 3) and basal area (Table 2).The results of Pearson correlation tests between each of these response variables of species richness and diversity with stem density were r = −0.272,p < 0.25 and r = −0.517,p < 0.02, respectively.The results of Pearson correlation tests between each of these response variables and basal area were r = −0.550p < 0.002 and r = −0.739,p < 0.001, respectively (Table 2).The inset shows the variations of the diversity and stem numbers across lowland sites where more plots were available for this study.Similar variability may exist in higher elevations but no additional replicates of plots were available to verify.

Lidar Metrics and Species Richness and Diversity
Topographic and three-dimensional structure were correlated with metrics of species richness and the Shannon diversity index (Table 2).Lidar metrics on topographic variables were more closely correlated with Shannon diversity than species richness.For the topographic variables, elevation was significantly correlated with species richness and diversity.For height metrics, species richness was correlated with the top two canopy height metrics, rh100 and rh75, and the standard deviation of the understory, while diversity was correlated with the top three canopy height metrics and the standard deviation of the upper canopy.The lidar metrics derived from the quartiles of the lidar waveform were found to have a relatively linear relationship with tree species richness and diversity, particularly for lower rh metrics (Figure 4a,b).The multivariate regressions focused on the use of topographic and three-dimensional structure to predict species richness and the Shannon diversity index.Using best subset selection, the species richness model with optimal BIC included two predictors, rh100 and rh25 SD, and the model with the second-lowest BIC included three predictors, adding aspect to the model (Table 3).These models had r 2 values of 0.494 and 0.557, respectively.Applying this method to the multivariate regression predicting the Shannon diversity index selected an optimal model, with 10 predictors (Table 4, Figure 4).Regularization was performed on models for species richness and Shannon diversity index.We used a shrinkage parameter lambda, that minimizes the residual sum of squares, plus an L2 constraint, meaning that the shrinkage parameter lambda is the coefficient of an added term that is the sum of the squared coefficients.This constraint encourages the beta coefficient estimates to become smaller than they would be using ordinary least squares regression.Using L2 constraint, ridge regression optimized the shrinkage parameter lambda and yielded sets of coefficients (S2) with 14 degrees of freedom.This was done using the package glmnet; the value of our optimized lambda was 0.04037017 [63].Comparison of multiple linear regression subset selection and regularization methods was performed using 5-fold cross validation to estimate the mean squared error on testing data.Cross validation on multiple linear regression subset selection models for species richness and Shannon diversity yielded average validation set mean squared errors of 572.2 and 0.02 respectively, while cross validation on models using ridge regression yielded higher mean squared errors of 2628.8 and 0.39 (Table 5).Overall, the reduced multiple linear regression models formed using best subset selection (10 predictors for diversity, 2 predictors for species richness) performed the best (Figure 5).This map of Shannon Diversity Index over the Braulio Carrillo National Park area exemplifies the extrapolation capabilities, when using combined ground and airborne data (Figure 5).We see that the highest elevation areas (South West corner of BCNP, bottom left of map) tend to have the lowest species diversity.We obviously do not see a visible mid-elevation bulge of diversity in the raster image, but there are more pixels that are mid-to high-diversity in the center of BCNP, and in the mid-elevation region depicted in the digital elevation model of Figure 1b.There are some homogenous clumps of pixels in the middle of BCNP, which are likely artifacts caused by sharp changes in elevation, which can cause errors with lidar collection [39].The southeast portion of BCNP was not covered by the lidar sensor, so we masked it out of the extrapolation.Table 3. Best subsets of lidar-derived predictors to model species richness along with the r 2 value and BIC of each model.The model with the optimal number of predictors for each response variable is bolded.Elevation = Elev., rh = relative height, SD = standard deviation.

Plot and Landscape Data
Plot level lidar data on topography and three-dimensional structure were compared to landscape-level data in 100 m bins, in order to see if the lidar-derived topographic and height data from the plots was representative of the landscape as a whole.Slope and aspect from the plots were not significantly different than the landscape (p = 0.161, p = 0.818 respectively).The mean canopy heights were correlated for the plots and landscapes (p < 0.05) (Figure 6a); however, the canopy heights were significantly taller across the landscape than plots (p < 0.01).There was a significant difference in standard deviation of plot and lidar derived landscape data (p < 0.01).The standard deviation from plots were relatively similar across the elevational gradient (generally within 2 m), while the standard deviation at the landscape scale showed wide variation in rh100 and rh75 and a clear increase in standard deviation with elevation (Figure 6b,c), with exceptions of the two highest elevation sites 2355 and 2814 m.

Species Richness and Diversity
Overall, patterns of species richness and diversity decreased with elevation, with a peak at 552 m of the total 2906 m of elevation of the mountaintop.This peak in diversity had an overlap of species from both the higher and lower elevation plots, and had the highest number of shared species with the other sites, as determined through a test of beta diversity (Figure 7).This seemingly skewed bulge may also have to do with the geometry of a mountain or mountain range, as higher elevations cover less area as they come to a peak, compared with larger geographic areas at mid-elevations.According to the species-area relationship, larger geographic areas should support a greater variety of species [7].As elevation increases in simple montane environments, the total area within zones of elevation decreases in accordance with basic geometry, which would support the idea that higher elevations would have fewer species.However, the ruggedness of terrain can influence the surface area in certain bands of elevation, making the area effectively larger [7].The mid-elevation bulge of tree diversity has been observed across continents and biomes in the Himalayas [17,18,20], Costa Rica [22], Mexico [14,15], Peru [64], and Ethiopia [21].Although a monotonic linear decrease of diversity has been reported for many studies, it appears that many of these studies often do not sample the entire gradient, instead presenting a truncated form of the mid-gradient bulge pattern [12].These results suggest that if the entire elevational extent is not sampled, the overall patterns may not be correctly identified, and an incorrect assumption might be made regarding linear decreases in plant diversity with elevation change; this may cause the display of a mid-elevation bulge, as we found on our mountain gradient.

Forest Structure from Field Measurements
Patterns in species richness and diversity were not associated with increased stem density and basal area.When comparing the number of stems and basal area directly to the Shannon diversity index, we found that the three highest elevation plots (1976, 1355, and 2814 m) had a distinctly different pattern than the other sites.While most sites had a positive relationship between diversity and higher numbers of stems/larger basal area, the three high elevation plots displayed the opposite pattern.They were relatively species poor, but had higher stem density, particularly at the 2814 m site, where there were twice as many stems within 1-ha.Thus, stem number and basal area from plot data alone may not be used to predict tree species diversity at different elevations.This can be due to a variety of factors, including that our plot locations may not be representative of the gradient as a whole.Without replicates, we cannot know whether certain factors are solely due to the aspect of the plot locations.

Topography and Three Dimensional Structure from Lidar
Lidar metrics on topography and three-dimensional structure were associated with both species richness and diversity.We expected that high elevation, steep slopes, and slopes that receive the most solar radiation would have low species richness and diversity, while high forest canopies with high canopy heterogeneity would have greater species richness and diversity.The topographic metrics of elevation were the most important in this study.Although mean and standard deviation of slope and aspect was not correlated with species richness or diversity, these topography metrics improved diversity models, and combined, could explain more of the variance than the three-dimensional structure metrics.High forest canopies and understories had the highest species richness and diversity, while high canopy heterogeneity-as measured by the standard deviation-have greater diversity.This suggests that high canopies with greater variation allow for more species to persist, resulting in higher diversity.

Plots vs. Landscape
We found that forest canopy height structure displayed a non-linear relationship along the gradient, with various peaks of canopy height.At the landscape scale, when we calculated canopy height across the same elevation as plots in 100 m bins, we found two peaks of canopy height-the highest around 500 m and a secondary peak at 1900 m, with maximum heights of 52 and 45 m respectively.The height metrics extracted from just within the field plots resulted in similar patterns across mean height metrics as those of the landscape scale, although landscape values were generally taller than plots.This can have implications when we use the relationship developed from the plots to the landscape.At the plot level, we found the standard deviation of the height metrics was highest at the lower elevation sites; however, the landscape scale clearly had higher standard deviations of the height metrics increasing with elevation.This high variation in the forest canopy, and difference between the landscape and plot level analysis, could be due to higher disturbance in highlands versus lowlands, which could also affect the diversity and observed shifts in composition [22].These results differ slightly from findings on the same gradient from a study published in 1996, with ground-measured height displaying peaks of height at 300 m and 1750 m elevation, with heights of 47 m and 31 m respectively [6].This is also different from what has been found in the Andes, where tree height and biomass decrease with elevation [29].However, tree height does not always simply decrease with elevation in the Neotropics.In terms of plot locations, they were placed for field surveys, and the lidar was collected years later.If the lidar was collected before setting up the plots, the sampling could have changed, and we may have obtained slightly different results.With the lidar data, we would have had better knowledge of the topography and structural variations, and preferentially chosen certain slopes or elevations based on features seen in that data.We recommend that non-linear models be tested in the future (e.g., below and above 552 m), along with other biodiversity metrics-such as butterfly and bird diversity-that are available for some of the plots in this study.

Modeling Diversity and Species Richness
Lidar metrics of topography and three-dimensional structure can be used to predict tree diversity and species richness to a lesser extent along elevational gradients, in areas without ground data.There have been a number of studies that have examined tree species richness and diversity from plots, and spectral sensors and indices (e.g., NDVI), over tropical forests; however, these have only been able to explain between 30% and 40% of the variation in tropical tree species richness [65][66][67].Active remote sensing, such as airborne radar and small footprint lidar, have been able to explain between 25% to 44% of the variation in tree species richness (10 cm dbh) within tropical forests [37,39,68].Our results from large footprint lidar can explain 94% of the variation in tree diversity, and 49% of the variation in tree species richness.This suggests that large footprint lidar can be used to quantify tree diversity and species richness in regions with 1-ha plots, and may be able to estimate diversity and species richness along elevational gradients of forests in the tropics.For diversity, a number of topography and height metrics best explain tree diversity patterns, and these should be tested in other regions.However, since we tested a number of variables (n = 20) that have been associated with tree diversity from the literature (and observed that there was not a linear relationship with diversity), fewer variables and non-linear models should also be tested in the future.We realize that many variables, with so few plots, can cause an issue with overfitting.It is difficult to avoid overfitting in our situation, because the data collection is limited to small sample sizes, coupled with the presence of a lot of noise in real life.However, this is still an interesting analysis, even with the issue of overfitting.For species richness, the overall canopy height (rh100) and the heterogeneity in the understory (standard deviation of rh25) may be the two best metrics associated with tree species richness at 1-ha for tree 10 cm dbh.Tree species richness may be the best variable to test in the future using air and spaceborne lidar data, since the two metrics are easy to calculate, extrapolate, and because they explain nearly half of the tree species richness.We would expect to find canopy height and understory structure to be correlated with diversity for other tropical forests along elevational gradients.Indeed, Guo et al. [8] identified peaks of diversity in the tropics, and these peaks may be directly related to canopy height and topographic complexity.

Future Research
Active remote sensing systems like lidar sensors are becoming increasingly popular, making new forest analyses possible.Lidar has the ability to collect tree height and other aspects of forest structure and topography that are important for accurate measurements and monitoring of forest inventory information [25,[69][70][71].LVIS lidar has been collected along elevational gradients for a number of forests across the United States; however, the coverage over tropical forests remains limited to Costa Rica (Land, Vegetation, And Ice Sensor (LVIS) [55]).Globally comparative canopy height data (e.g., relative height) are available from the Geoscience Laser Altimeter System (GLAS) sensor, and have been found to correlate with results from LVIS [72,73].Thus, it should be possible to compare topography and three-dimensional structure data from GLAS tracks to tropical field plots, in order to test results from our research, and possibly map diversity and species richness.NASA's Global Ecosystem Dynamics Investigation Lidar (GEDI) will provide a global dataset in the future on canopy structure, similar to GLAS and LVIS, but with a footprints of 25 m.This dataset may be used to maps patterns of vegetation structure using relative height metrics and diversity across elevational gradients in the tropics, and at larger scales than the LVIS airborne lidar used in this study.

Conclusions
We examined tree species richness and diversity from field data (20 1-ha plots) on forest structure (stems, basal area) and large footprint lidar derived data on topography and three-dimensional forest structure along an elevational gradient in Brauilo Carrillo, Costa Rica.This study is special because airborne lidar data, with paired field collections, is rare over tropical elevation gradients.Slope and aspect from the plots seemed to be representative of the surrounding landscape of the national park; however, the canopy height was significantly taller across the landscape, and there were differences in standard deviation of lidar derived plot and landscape data.Species richness and the Shannon diversity index showed a distinct pattern of increasing up to about 600 m elevation, and then decreasing at higher elevations.Stem density and basal area were negatively associated with species richness and diversity, with both peaking at about 2800 m.Elevation was negatively correlated with species richness and diversity, while the two tallest relative height metrics (rh100, rh75) derived from lidar were both significantly positively correlated with species richness and diversity.The best lidar-derived topographical and three-dimensional forest structural models showed a strong relationship with the Shannon diversity index (r 2 = 0.941, p < 0.01), with ten predictors, while the best species richness model was weaker (r 2 = 0.599, p < 0.01), with two predictors (rh100, standard deviation of rh25).The resultant map of Shannon Diversity shows the lowest diversity at the highest elevations in the southwest of BCNP, with higher pixel values of diversity in mid-elevation areas.This suggests a potential for mapping species richness and diversity across elevational gradients in tropical montane ecosystems using spaceborne lidar.

Figure 1 .
Figure 1.Map of Braulio Carrillo National Park using Landsat 8 data collected 26 January 2017, with 20 1-ha plot locations/Braulio Carrillo National Park (black outline) and La Selva Biological Station (white outline).Inset shows location of BCNP in the country of Costa Rica.Plots used in analysis are in Red (TEAM) and Green (Carbono), red circles are around TEAM elevation transect plots.On the right is a Digital Elevation Model (DEM) from the Shuttle Radar Topography Mission (SRTM).

Figure 2 .
Figure2.Species richness and Shannon Diversity Index along an elevation gradient in Costa Rica.The inset shows the variations of diversity across lowland sites where more plots were available for this study.Similar variability may exist in higher elevations but no additional replicates of plots were available to verify.

Figure 2 .
Figure 2. Species richness and Shannon Diversity Index along an elevation gradient in Costa Rica.The inset shows the variations of diversity across lowland sites where more plots were available for this study.Similar variability may exist in higher elevations but no additional replicates of plots were available to verify.

Figure 3 .
Figure 3. Stem density and Shannon Diversity Index from field data along the elevation gradient.The inset shows the variations of the diversity and stem numbers across lowland sites where more plots were available for this study.Similar variability may exist in higher elevations but no additional replicates of plots were available to verify.

Figure 5 .
Figure 5. Map of predicted Shannon Diversity.

Figure 6 .
Figure 6.Plot-level and landscape-level mean lidar-derived relative height (rh) metrics across the elevational gradient (a).Red/orange shades are at the plot-level, green shades are the landscape patterns.Standard deviations of lidar-derived relative height (rh) metrics across the elevational gradients (b) Plot-level; (c) and landscape-level mean.

Figure 7 .
Figure 7. Simple beta diversity metric of shared species between sites based on elevation difference between the two sites.Jaccard and Sorensen indices showed same pattern, as did geographic distance instead of elevation difference.

Table 1 .
Data from 20 1-ha plots along an elevational gradient in Costa Rica.The relative height (rh), slope, and aspect (cosine) were derived from lidar.

Table 4 .
Best subsets of lidar-derived predictors to model Shannon diversity index along with the R-squared value and BIC of each model.The model with the optimal number of predictors for each response variable is bolded.Elevation = Elev., rh = relative height, SD = standard deviation.

Table 5 .
Five-fold coefficient of variation (CV) estimates of Mean Square Error (MSE) for Multiple Linear Regression (MLR) subset selection and ridge regression models.