Article Modelling Forest α-Diversity and Floristic Composition — On the Added Value of LiDAR plus Hyperspectral Remote Sensing

The decline of biodiversity is one of the major current global issues. Still, there is a widespread lack of information about the spatial distribution of individual species and biodiversity as a whole. Remote sensing techniques are increasingly used for biodiversity monitoring and especially the combination of LiDAR and hyperspectral data is expected to deliver valuable information. In this study spatial patterns of vascular plant community composition and α-diversity of a temperate montane forest in Germany were analysed for different forest strata. The predictive power of LiDAR (LiD) and hyperspectral (MNF) datasets alone and combined (MNF+LiD) was compared using random forest regression in a ten-fold cross-validation scheme that included feature selection and model tuning. The final models were used for spatial predictions. Species richness could be predicted with varying accuracy (R2 = 0.26 to 0.55) depending on the forest layer. In contrast, community composition of the different layers, obtained by multivariate ordination, could in part be modelled with high accuracies for the first ordination axis (R2 = 0.39 to 0.78), but poor accuracies for the second axis (R2 ≤ 0.3). LiDAR variables were the best predictors for total species richness across all forest layers (R2 LiD = 0.3, R2 MNF = 0.08, R2 MNF+LiD = 0.2), while for community composition across all forest layers both hyperspectral and LiDAR predictors achieved similar performances (R2 LiD = 0.75, R2 MNF = 0.76, R2 MNF+LiD = 0.78). The improvement in R2 was small (≤0.07)—if any—when using both LiDAR and hyperspectral data as compared to using only the best single predictor set. This study shows the high potential of LiDAR and hyperspectral data for plant biodiversity modelling, but also calls for a critical evaluation of the added value of combining both with respect to acquisition costs.

final models were used for spatial predictions.Species richness could be predicted with varying accuracy (R 2 = 0.26 to 0.55) depending on the forest layer.In contrast, community composition of the different layers, obtained by multivariate ordination, could in part be modelled with high accuracies for the first ordination axis (R 2 = 0.39 to 0.78), but poor accuracies for the second axis (R 2 ≤ 0.3).LiDAR variables were the best predictors for total species richness across all forest layers (R 2 LiD = 0.3, R 2 MNF = 0.08, R 2 MNF+LiD = 0.2), while for community composition across all forest layers both hyperspectral and LiDAR predictors achieved similar performances (R 2 LiD = 0.75, R 2 MNF = 0.76, R 2 MNF+LiD = 0.78).The improvement in R 2 was small (≤0.07)-if any-when using both LiDAR and hyperspectral data as compared to using only the best single predictor set.This study shows the high potential of LiDAR and hyperspectral data for plant biodiversity modelling, but also calls for a critical evaluation of the added value of combining both with respect to acquisition costs.

Introduction
Biodiversity describes the genetic and functional variability within and between all taxonomic ranks of life as well as within and between their organizational aggregates, such as communities or ecosystems.Biodiversity is a key element in the provision of ecosystem services influencing ecosystem functioning and stability [1][2][3][4].Owing to land-use changes, climate change, invasive species and others [5], global biodiversity is in decline and predicted to remain so within the near future, although there is a high degree of uncertainty and future predictions of biodiversity trajectories vary by orders of magnitude [6].Despite considerable political efforts up to the highest international levels, such as the Convention on Biological Diversity (CBD) [7] and its consecutive protocols, the estimated biodiversity loss is still unhalted [8].
One reason for the uncertainty in biodiversity estimates is the inherent difficulty in measuring and monitoring biodiversity.Nevertheless, in order to slow down negative biodiversity trends, knowledge about the spatial distribution and temporal dynamics of biodiversity on scales relevant to the conservation decision makers is needed.Indeed the CBD calls upon each party to monitor "through sampling and other techniques, the components of biological diversity [. . .] paying particular attention to those requiring urgent conservation measures" and further to "identify processes and categories of activities which have or are likely to have significant adverse impacts on the conservation and sustainable use of biological diversity, and monitor their effects" [7].
Conservation planning is usually made on landscape to regional scales [9] where local plot information is too sparse and continental or global estimates are too coarse, yet the area to be covered is too large for direct inquiry.Remote sensing techniques promise to close this gap by providing spatially continuous information that can be related to point estimates of biodiversity at comparatively little cost.Moreover, remote sensing allows frequently repeated recording of environmental information, thereby providing time-series which are essential to monitoring and understanding biodiversity gradients and their temporal trajectories.
According to Turner et al. [10] two basic approaches in assessing biodiversity aspects from remote sensing data can be distinguished, namely direct and indirect ones.The former aims to detect the variable of interest, e.g., species or communities, by directly relating their occurrence to remote sensing derived measures, such as species-specific spectral signatures determined by leaf chemical constituents [11,12] or plant species cover based on species-specific phenology [13].Indirect remote sensing of biodiversity refers to models based on remote sensing derived environmental variables such as primary productivity [14], time series of vegetation indices [15,16] or habitat structure [9,17].
In Europe, forest biodiversity is one of the major conservation targets and at the same time provides one of the major challenges for the establishment of conservation strategies at regional scales.This is because most of Europe's typical natural land-cover would be forests, although much has been lost or converted due to human activities [18].Using remote sensing technologies for direct mapping of plant biodiversity has been reported to work with promising results in open type ecosystems (e.g., [15,[19][20][21]).However, forest biodiversity is difficult to assess due to their explicit three-dimensional structure.Passive remote sensing systems are only able to detect the forest surface, the canopy and the contributing tree species.Consequently many studies on forest plant diversity focus on trees (e.g., [12,[22][23][24][25][26][27]).However, considering vascular plants, the most species rich element of temperate European forests is often the herb layer [28], while species richness of trees is mostly small.
It is commonly expected that very high spatial resolution optical imagery (hyperspatial) is needed for remote sensing based biodiversity assessments [38], since hyperspatial imagery reflects individual small-scale structures such as tree crowns.However, increasing the spatial resolution brings with it also an increase in variability in the reflected signal, which complicates any classification task [26,39].Hyperspectral remote sensing on the other hand allows to very precisely distinguish surface characteristics by recording additive sub-pixel spectral signatures [32], thereby reducing the need for extremely high spatial resolutions and avoiding the corresponding additional noise.
The key open question is whether canopy species composition can be a surrogate for sub-canopy communities.Assuming that communities are entities across forest strata, hyperspectral remote sensing can then be expected to allow modelling complete and sub-canopy forest communities by assessing only a fraction thereof, i.e., canopy composition.This of course will very much depend on the ecosystem and communities in question.However, independent of community relationships, another mechanism may come to aid: Hyperspectral data reflects environmental conditions acting upon plants, such as soil pH, water availability, nitrogen availability and others [40,41], which are known to influence species distributions and community composition.Low canopy nitrogen levels, for example, might then reflect substrate conditions which affect and shape herb layer communities.
On the other hand, active systems like RaDAR and LiDAR directly provide information on the sub-canopy layers of forests by penetrating canopies and preserving the position of reflecting objects.The structural information provided has been shown to be highly valuable for biodiversity modelling encompassing vegetation density distribution in different forest strata, which is an essential variable describing vegetation itself [42] but also for explaining distributions of other groups such as birds [43][44][45], forest beetles [9] or spiders [17].
It is expected that hyperspectral and LiDAR data will complement each other and result in improved modelling capabilities once combined [46], especially for modelling of multi-layered forests.While hyperspectral data provides detailed surface reflectance characteristics, LiDAR adds detailed three dimensional positional information.Studies using hyperspectral and LiDAR data for modelling biodiversity mostly report increased accuracy when using both hyperspectral and LiDAR derived information compared with when using only one of them, e.g., in forest tree species identification [25,46,47].
These hyperspectral and LiDAR approaches have to be seen in context of upcoming space-borne sensors.While studies relying on airborne remote sensing are often limited in their spatial extent, civil hyperspectral space-borne sensors are about to become increasingly available, e.g., EnMap [48] or HySpiri [49].Moreover, they will be complemented by further space-borne LiDAR and RaDAR systems, which may allow assessing three-dimensional vegetation structure from space [30].
The purpose of this study is to investigate which information on vascular plant diversity can be derived from remotely sensed data.In particular, the capabilities of hyperspectral and LiDAR predictors are analysed for different vegetation strata in complex forest ecosystems.The following two hypothesis were tested.First, plant species richness and communities of different forest layers can be modelled relying only on remote sensing data.The combination of LiDAR and hyperspectral predictors is expected to improve the predictive power of the models because they reflect complementary information.Furthermore, we hypothesised that LiDAR performs better than hyperspectral data in modelling sub-canopy patterns.We expected this because the former provides direct information on the vegetation while the latter can do so only via proxy relationships of canopy composition and biophysical state to sub-canopy communities and environmental condition.Second, we expected that reducing the dimensionality of the learning problem prior to modelling the biodiversity metrics would improve model predictions.We used up to 135 predictors, many of which are noise variables.Therefore, we expected an increase in model performance when only relevant predictors are used as input variables.

Study Site
The study area is situated in the Bavarian Forest national park in Germany at the border to the Šumava national park of the Czech Republic (Figure 1).The Bavarian Forest national park covers an area of 24,000 ha and ranges from 600 to 1,440 m a.s.l.It is located at about 49 • N, 13 • E and hence subject to a mixture of oceanic and continental climate.The prevailing vegetation type is montane mixed forest on predominantly acidic soils, dominated by Picea abies, Fagus sylvatica and Abies alba.Towards higher altitudes Picea abies becomes the dominating species [43].Temporally and spatially distributed monitoring of biodiversity patterns at landscape scales is especially important in the Bavarian Forest national park because of its highly dynamic environment created by the European spruce bark beetle Ips typographus, which has contributed to a rapidly changing landscape of different successional stages [51].

Hyperspectral Data
Two scenes of hyperspectral imagery were acquired in August 2009 by the airborne HyMap TM sensor [52], which was operated by the German Aerospace Center (Figure 1 and supplementary materials, Figure S1).The HyMap sensor is a whisk broom sensor that records 128 bands, of which 125 bands remain after preprocessing, in the wavelength range of 450-2,480 nm.Technical issues with the differential GPS unit during the flight campaign meant that the ordinary GPS position had to be used, resulting in a reduced spatial accuracy.
The two scenes were spaced about 10 km apart.Both the northern and the southern scene were approximately 4 km wide and 8 km long.This corresponds to a covered area of 3,116 ha for the northern and 2,713 ha for the southern scene.Flight altitude was approximately 4,000 m a.s.l.resulting in a spatial resolution of 7 m.Because of a non-ideal flight time (10 AM), significant shadows in both scenes and clearly visible bidirectional reflectance effects (BRDF) in the northern scene (flight heading ≈ 96 • ) could not be prevented.Cloud cover was 0% in both cases.
Geometric, radiometric and atmospheric preprocessing of the HyMap data was conducted by the DLR.Orthorectification and atmospheric correction was carried out using ATCOR4 [53] with settings for rugged terrain, variable water vapour and rural aerosol type.The final product was Level 2 processed surface reflectance.The geometric accuracy of the HyMap scenes was improved by manual geo-referencing based on aerial colour infra-red (CIR) imagery of 0.4 m spatial resolution.Total root mean square errors were 2.7 m and 5.5 m, respectively.
In order to reduce the high redundancy in the hyperspectral imagery, the data were subjected to a minimum noise fraction (MNF) transformation [54] as a means of information concentration using the forward MNF transform implemented in ENVI v4.7 [55].The transformation was based on a mosaicked image of the two HyMap scenes in order to ensure that the MNF components reflect the same features in both scenes, which was a necessary condition for combined modelling of the northern and southern scene.The eigenvalues of the MNF components dropped to a minimum of approximately 6.Using an elbow criterion, one could have cut-off the MNF components at about the 50th MNF component.However, in visual examination of the MNF images, features could still be identified in the 125th component.Therefore, all MNF components were used as predictors for further analysis.

LiDAR
Airborne, full-waveform LiDAR data were acquired in May 2007 after leaf flush using a Riegl LMS-Q560 scanner [43].The laser operated at a wavelength of 1,550 nm.Average point density was 25 pt m −2 at a vertical accuracy of 0.15 m.Flight altitude was 400 m above ground, which resulted in a footprint diameter of 0.25 m.Three transects of about 400 m width were scanned, which equals an area of 558 ha within the northern and 429 ha within the southern HyMap scene.
The waveform echoes were decomposed into discrete returns by fitting Gaussian functions as described in Reitberger et al. [56].One laser pulse generated one to eleven returns depending on the forest structural complexity.Subsequently, these were used to calculate a digital terrain model of 1 m resolution, which was subtracted from the height of all other returns to convert them from altitude above sea level to height above ground.See Müller et al. [43] for a detailed description of these LiDAR preprocessing steps.
Based on these 3D point-clouds, ten structural indices were calculated.A canopy model of a spatial resolution of 0.25 m was derived, from which mean canopy height (HMEAN), standard deviation of canopy height (HSTD), maximal canopy height (HMAX) were calculated for 7 m raster cells.For the same raster we calculated penetration ratios, a measure of vegetation density, for multiple forest strata as the number of returns below the lower layer boundary divided by those below the upper layer boundary.Layer ranges reflected ecologically meaningful heights of herb layer (PR0-1.5, 0-1.5 m), shrub layer (PR1.5-5,1.5-5 m) and tree layer.The latter was further split up into two categories (5 to 12 and 12 to 50 m), mainly dividing between young and old trees.Furthermore, overall penetration ratios over all layers were calculated (PR0-50, PR2-50, 0-50 m and 2-50 m respectively).Finally, the height of median return energy (HOME) was calculated based on the initial waveforms as described in Drake et al. [57].The HOME parameter reflects both canopy cover and the vertical distribution of sub-canopy elements.

Biodiversity Data
A vegetation survey was carried out in July and August 2006 as part of the BIOKLIM project [58].Sampling followed two main altitudinal transects within the area covered by remote sensing data (supplementary materials, Figure S1).In total 181 circular plots (radius = 8 m) were established, spaced 100 m apart.At each plot, vascular plant species cover-abundance was recorded for herb layer (HL), shrub layer (SL) (<5 m), tree layer (TL) (>5 m).Eventually, HL, SL, and TL were combined into a single presence-absence scaled set (ALL).Cover was estimated using the Braun-Blanquet scale.
Because of large-scale bark beetle outbreaks, several of the plots have lost their canopy cover completely from the date of their recording in 2006 up to the date of the HyMap overflight in 2009.Thirty-two plots which have been subject to bark beetle infestations were excluded from any further analysis, based upon visual change detection using aerial CIR photography from 2006 and 2009, leaving a total of 148 plots.Since not all layers were present in all plots, this corresponded to 147;92;129 plots in HL, SL and TL respectively.
Vascular plant α-diversity was analysed in terms of species richness SR and Shannon index H . SR was calculated as the count of all species in a plot.Additionally, the abundance based H was modelled because of its usefulness in conservation, giving emphasis to rare species [59].Since the calculation of the Shannon index requires abundance data it was not calculated for the presence-absence dataset ALL.
While α-diversity in itself reflects only a limited aspect of biodiversity, it becomes very useful for conservation planners when augmented with taxonomic identity in the form of community composition.Community composition was examined by non-metric multidimensional scaling (NMDS) [60,61].NMDS maps the position of sites in species space, in our case represented by the Bray-Curtis distance, onto a predefined small number of axes in an iterative search for an optimal solution.As with most descending gradient algorithms, the solution of NMDS can be sensitive to starting configurations when local minima are present in the objective function.In order to circumvent this problem, we conducted multiple NMDS runs with random starting configurations until a convergent solution was identified.The number of NMDS dimensions was fixed to two for all layers after visual examination of the scree plot of stress vs. dimensions.Species scores in NMDS space were estimated a posteriori as abundance weighted average scores.Subsequently, site scores on the NMDS axis were modelled separately for each axis, i.e., NMDS 1 and NMDS 2 .

Data Preparation
For all plots, the corresponding values of the MNF transformed HyMap data and the LiDAR indices were extracted using an 18 m buffer around the plot centre.All pixels falling into the buffer were aggregated by their mean.The buffer size of 18 m was decided upon based on the plot size of radius 8 m and a typical GPS error of the true plot location of 10 m, owing to poor signal transmission in dense forests.
After the extraction of the predictor values, the LiDAR predictors were combined with the MNF transformed HyMap predictors.Three final sets of predictors were used for all further analyses: LiDAR (LiD, 10 variables), MNF components (MNF, 125 variables), LiDAR and MNF components (MNF+LiD, 135 variables).,H ) and non-metric multidimensional scaling (NMDS) ordination scores of two NMDS axes were modelled as response variables (NMDS 1 ,NMDS 2 ).The procedure was repeated for herb, shrub and tree layer and their total, using LiD, MNF and MNF+LiD predictors both with and without feature selection.

Statistical Analysis
A multitude of algorithms has been used in the literature for remote sensing based biodiversity modelling, such as support vector machines [46], neural networks [62], partial least squares regression [63] or random forests [64].We used random forests [65], because of their good predictive performance in high dimensional problems [66].
See Figure 2 for a schematic of the methodological framework.All biodiversity metrics were modelled individually per vegetation layer as response variables as a function of the three sets of predictor variables.Reported correlations are in terms of Kendall's τ rank order correlation coefficients since variables were not distributed normally.We selected Kendall's τ because it makes fewer assumptions than Spearman's ρ.While both Spearman's and Kendall's correlation coefficients are rank-based and thus non-parametric, Spearman's ρ assumes equal distances between adjacent ranks, whereas Kendall's τ only utilizes ordinal information.Maps are displayed in WGS84 UTM 33N projection.
Model building was embedded in a 10-fold cross-validation scheme.In order to ensure similar distributional properties in each fold, they were assembled by means of random sampling stratified on response variable quartiles.This follows the advice from Kohavi [67] who found stratified cross-validation to be the most suitable accuracy estimation scheme for model selection giving the least biased estimates of accuracy.Cross-validation was necessary to obtain unbiased prediction error estimates after model selection and tuning [68,69].Two cross-validation schemes were employed, one with internal feature selection and one without (details below).Within each fold, the following steps were taken.
First, all predictor variables were used to fit random forest models.Bootstrapping was conducted with replacement.The number of predictors tried on each split, mtry, was optimized by means of a extensive grid search over a fixed set of values.The number of trees was fixed to ntree = 1,000 as visual analysis showed convergence of the out-of-bag (OOB) error estimate R 2 OOB to stable values around 600 trees.All other tuning parameters were fixed to their default values.The steps above were repeated for each mtry i and the best model was chosen subsequently based on maximum R 2 OOB .Finally, the selected random forest was used to predict the partition of data left out by the cross-validation scheme.The predictions were then aggregated over all folds and the cross-validated prediction accuracy for regression was quantified by R 2 CV , which are reported with a bootstrapped estimate of their standard deviation.
As an expansion of this approach, a feature selection based on all predictors was carried out using the Boruta algorithm [70] before fitting the random forest but within each cross validation group in order to prevent over-fitting [68,71].This step was motivated by an expected increase in model predictive power by decreasing the amount of noise variables [72][73][74].The algorithm tests whether variable importance of each individual predictor is significantly higher than expected by chance.To this end, many random forest models are fit iteratively, either until all predictors are classified as confirmed or rejected at a significance level of 5% or until a predefined maximum number of iterations imax is reached.Predictors which are not significantly better or worse than random are then labelled as tentative.Boruta was executed with imax = 1,000 and ntree = 1,000.Feature selection was followed by random forest fitting based on the subset of confirmed and tentative predictor variables and all further steps as described above.
After model evaluation, the final ten random forest models were used in prediction mode based on an input raster covering the overlapping area of MNF and LiD.The ten prediction rasters were then averaged to a single final prediction.Additionally, the standard deviation σ CV of the ten predictions was calculated as a measure of uncertainty.In parallel to σ CV we present the spatial distribution of the cross-validation residuals.
All statistical analysis was carried out using R v2.14 [75].A transcript of the R-code is provided in the supplementary materials to this article.NMDS was conducted using the metaMDS function of the vegan package [76].Random forests were fit using the randomForest package v4.6-6 [77].Feature selection was conducted with the R-Package Boruta v1.6 [78].Graphics were prepared using the ggplot2 package [79].
Overall species richness ranged from 1 to 43 species per plot with twelve species on average.About 75% of the plots hosted no more than 15 species.Overall species richness was dominated by the herb layer, which featured the same distributional moments.Shrub layer and tree layer hosted only two species on average, namely Picea abies and Fagus sylvatica, and never more than five species, i.e., species richness was strongly skewed to the left in all layers.There was almost no correlation between species richness and altitude (τ = 0.07) and also no apparent spatial clustering (supplementary materials, Figure S2).Species richness exhibited high small-scale variability, where plots with high species richness mostly coincided with open areas or proximity to roads, abandoned forest tracks and settlements.In particular, the species richness of HL and also of SL was nearly independent of TL species richness (τ = −0.05 and 0.21).
Values of the Shannon index were highly correlated to species richness (Table 1) for shrub and tree layer (τ > 0.8), yet only modestly correlated in the case of the herb layer (τ = 0.5).
NMDS ordination onto two dimensions resulted in stress values per layer of 0.18, 0.21, 0.08 and 0.09 for ALL, HL, SL and TL.The scores of the first NMDS axis were weakly correlated to species richness and Shannon index (τ ≤ 0.3) within all layers.However, within ALL, HL and TL the scores on the second NMDS axis were moderately correlated to SR and H with −0.6 ≤ τ ≤ −0.4 (Table 1).Within the layers ALL and HL, high NMDS 1 scores corresponded to species typically found under sparse or open canopies, such as Deschampsia cespitosa L., Trientalis europea L. or Epilobium angustifolium L. (Figure 3).Contrastingly, species with their optima at low NMDS 1 scores were typically closed-canopy, shadow tolerating species such as Anemone nemorosa L., Lamium montanum Hoffm.ex Kabath, Carex sylvatica Huds.or Dryopteris filix-mas (L.) Schott.This could be confirmed by correlating species scores on the first NMDS axis with their respective Ellenberg light values (L) [80], which yielded a correlation of τ = 0.5 (-0.2 for NMDS 2 ).Other Ellenberg values, moisture F , nitrogen N , reaction R and temperature T were only weakly correlated to any of the NMDS axes (τ < 0.3).The ordination space of SL and TL mainly separated Picea abies and Fagus sylvatica, yet one has to bear in mind that they co-occurred in almost all plots and differed only in their abundance (supplementary materials, Table S1).
Table 1.Kendall's τ rank order correlation between response variables species richness (SR), Shannon index (H ), first and second NMDS axis scores (NMDS) for herb (HL), shrub (SL), tree (TL) layer and their total (ALL).Diagonals are not displayed.For the presence-absence scaled layer composite ALL H could not be calculated.
Bi-plot of site and species scores in NMDS space for herb (HL), shrub (SL), tree layer (TL) and their total (ALL).Species acronyms consist of the first three letters of genus and species respectively (see supplementary materials, Table S1 for decoding).Only species occurring in more than 20% of the plots are shown.Colour mapping depicts species richness.

Predictor Variables
There were strong correlations between some of the LiDAR variables up to a degree of |τ | ≤ 0.8 (Figure 4).Contrastingly, almost no correlation remained in the MNF predictor set.The small degree of correlation that did exist was due to the aggregation of MNF values in the buffer around the ground plots.Between the LiDAR and MNF predictor variables, a moderate degree of correlation was observed (|τ | ≤ 0.6), however in most cases there was no correlation.

Model Fitting
Optimization of mtry, the number of variables tried per node, lead to an average improvement in R 2 OOB of 0.09 and 0.07 with and without feature selection as compared to the worst model within each group.Moreover, maximal improvements of R 2 OOB reached up to 0.28.No generality for a unique optimal value of mtry could be observed.All tried values were chosen regularly as the optimal solution, although there was a tendency towards higher mtry values when no feature selection was conducted.
The predictors selected by the Boruta algorithm were mostly stable.Multiple runs on the same dataset typically resulted in about 81% consistently confirmed variables, while about 19% of the variables were identified only in some runs.After Boruta feature selection, the number of selected predictor variables ranged from 5 to 31 variables (supplementary materials, Figure S6).
From the LiD predictor set, most of the ten variables were selected as important in all cases for the layers ALL and HL across all response variables.For SL and TL not all LiD predictors were selected.Most of the selected MNF predictor variables were found within the first 35 MNF components.However, there were always some higher order MNF components which were chosen in most if not all ten cross-validation groups.There was high stability in modelling with the predictor composite MNF+LiD, in which mostly the same predictor variables as in the respective single sets were selected (see Figures S6-S8 in supplementary materials for detailed band selection histograms).

Model Results
Both the calculations with and without feature selection yielded very similar model performance (Figure 5).There was a maximal improvement of 0.09 in R 2 CV for MNF based prediction of HL species richness due to preceding feature selection.However, in other cases, performance dropped by 0.12 after feature selection was enabled.Nevertheless, differences in R 2 CV between runs with and without feature selection were almost always found to have overlapping bootstrapped standard deviations.In the majority of cases, feature selection provided no improvement.
In most cases, the OOB estimates matched the cross-validated estimates closely, with the notable exception that in all cases involving MNF predictors there was a clear mismatch between R 2 CV and R 2 OOB estimates after feature selection.In these cases, the OOB estimates exceeded the R 2 CV by up to a maximum of 0.3.Only in very few cases of SL did R 2 OOB underestimate the actual R 2 CV by up to 0.2.Species richness could be modelled with moderate success (Figure 5, row one).Best results were obtained for SL which could be modelled with R 2 CV = 0.55 without feature selection.ALL and HL reached R 2 CV values of 0.3 and 0.26 respectively.LiD performed the best in predicting ALL, HL and SL richness, while only TL richness was explained better by MNF and MNF+LiD.However, there was a drop in performance when using MNF+LiD as compared with LiD in the case of ALL, HL and SL.Contrastingly, TL benefited from the combination of MNF and LiD, reaching an R 2 CV of 0.29.For MNF predictors, feature selection almost always resulted in a small improvement in model performance, which however remained in the range of ±1σ of R 2 CV without feature selection.Modelling abundance based H resulted in smaller R 2 CV values than SR (Figure 5, row two).Without feature selection, SL R 2 CV still reached 0.25 for LiD predictors.The contribution of MNF was small, leading to R 2 CV = 0.27 when combined with LiD and feature selection.HL and TL H models achieved only poor predictive performance with each predictor set (R 2 CV < 0.17).Again in all MNF cases, after preceding feature selection, R 2 OOB values, which were previously close to the R 2 CV values, were 0.13 to 0.3 higher than the latter.
Scores of the first NMDS axis were modelled with high accuracy for ALL and HL by any of the predictor sets (Figure 5, row three).R 2 CV did not differ much between LiD, MNF and MNF+LiD ranging from 0.75 to 0.78 (ALL) and 0.49 to 0.52 (HL) without feature selection.This is confirmed by tightly overlapping density distributions of the bootstrapped estimates (see Figure S.10 supplementary materials).Feature selection had almost no effect on model performance.SL NMDS 1 model performance benefited from the combination of LiD+MNF.R 2 CV increased from 0.34 (LiD) and 0.4 (MNF) to 0.46 (MNF+LiD).Similarly, TL R 2 CV was higher for MNF (0.37) than LiD (0.29) and highest for MNF+LiD (0.39).In contrast to SR and H , R 2 CV decreased for SL and TL models with increasing R 2 OOB when feature selection was conducted.Finally, the scores on the second NMDS axis were modelled with moderate success (Figure 5, row four).LiD based models of ALL, HL and TL obtained R 2 CV of 0.24, 0.3 and 0.22 respectively.SL NMDS 2 scores could not be modelled successfully.As before, the same pattern of increasing R 2 OOB with decreasing R 2 CV was observed when feature selection was enabled for MNF and MNF+LiD.

Spatial Predictions
The spatial predictions of the models with highest cross-validated performance are shown in Figures 6-8, namely SR for SL as well as NMDS 1 for ALL and HL.
Overall species richness (supplementary materials, Figure S11) was generally predicted to be in the range of 5 to 20 species.The models typically overestimated SR, except for the few plots with very high SR which were grossly underestimated, as can be seen from the residuals.Open areas, mostly post bark beetle succession areas, showed highest disagreement between the ten cross-validation LiD models.MNF based predictions showed significant salt and pepper effects, owing to the high noise content in higher MNF components.The decrease in accuracy (Figure 5, row one) when combining both LiD and MNF is clearly visible in the overestimation of SR.The SL species richness (Figure 6) was predicted to be highest in open areas, especially along forest edges and in mid-altitude mixed forest.There was no clear tendency with respect to the spatial pattern of residuals.Agreement between the cross-validation models, however, was high in all cases.Predictions of overall NMDS 1 scores clearly separated closed-canopy and open-canopy areas (Figure 7).Agreement between the prediction maps of LiD, MNF and MNF+LiD was very high and the same holds for the ten cross-validation models of each predictor set.For LiD no specific tendency in the spatial distribution of the residuals was found.In the MNF based prediction, open areas and transition zones are typically slightly underestimated.This bias is reduced in part when combined with LiD.
The NMDS 1 scores of HL were, as expected, similarly distributed as the overall estimates (Figure 8).MNF based models, as compared to LiD, had an increased tendency to underestimate NMDS 1 scores in open areas and transition zones and to overestimate them under closed canopies.

Discussion
With this study we aimed to estimate the added value of combining hyperspectral and LiDAR remote sensing data for modelling vascular plant diversity patterns in complex forest ecosystems.We expected that, since both systems provide distinct information, they would complement each other, leading to an improved set of model predictors.However, we found that the added value of hyperspectral plus LiDAR data was small.While both systems achieved fair cross-validated accuracies in modelling vascular plant species richness and performed well in modelling aspects of overall, herb layer and tree layer communities, combining them did not result in improved models.This is unexpected, since the limited correlation between LiD and MNF variables suggested that each one contains distinct information (Figure 4).
Overall, LiDAR based models performed best with respect to all vegetation layers.Hyperspectral data only provided valuable information on species richness within the tree layer.This result was expected since the trees shape most of the spectral reflectance recorded by HyMap.Therefore, we conclude that the tree layer composition does not provide sufficient proxy information on the underlying strata concerning SR in our study area.In general, modelling species richness directly from remote sensing data is complicated by the fact that very different communities with different reflective properties may share the same species richness.A way to circumvent this problem is by exploiting relationships between variability in spectral reflectance and species diversity [23,36,81].Such approaches, however, are limited to mapping of "optically accessible" species, i.e., either to studies in open systems or studies concentrating on canopy species diversity.In temperate forests, where the herb layer is the most numerous element of species richness [28], this approach can only work if species richness of the herb layer is correlated to tree layer richness.Indeed, Vockenhuber et al. [82] reported such a relationship in German lowland beech forests.The presented data in contrast did not share this relationship (τ = 0.05), rendering this approach unsuited to the Bavarian Forest study site.
Model results for the abundance-based Shannon index were poor in most cases, except for the shrub layer.This is in contrast to Oldeland et al. [21] who found that using abundance based estimators of biodiversity could increase model performance as compared to using species richness alone, albeit only in some cases and for open savanna vegetation.These differing results may be due to the fact that in complex forest communities, reflectance is only a function of a fraction of the occurring species, i.e., mostly canopy species, and fractional cover does not scale with abundance in the remotely retrieved image.Nevertheless, along those lines it is surprising that hyperspectral data could not explain the tree layer Shannon index.
Community composition based on NMDS ordination scores was modelled with great success, especially for overall and herb layer composition using either LiD or MNF predictor sets, whereas tree layer community composition was predicted better using MNF predictors and best by MNF+LiD.In the case of ALL and HL, the first NMDS axis mainly reflected species turnover along a canopy openness gradient, spanned by shadow-adapted, forest-dwelling species to light-demanding, non-forest species.This explains the exceptionally good models, as canopy density is a fundamental quantity measured by LiDAR [45,83].With hyperspectral data this is less intuitive, although it has been used with moderate success in modelling forest structural attributes such as stem density [84].Our results suggest that unlike for species richness, there is a linkage between canopy and sub-canopy communities, which explains the good performance of hyperspectral predictors.
Models predicting the second NMDS axis did not reach the performance of NMDS 1 models.Nevertheless, ALL, HL and TL, unlike SL, still achieved moderate performance.This can be attributed to the second ordination axis explaining less variability of the vegetation composition in general and additionally because it reflected species richness, which in itself could be modelled with only moderate success.This is in contrast to results from non-forest ecosystems.Schmidtlein et al. [85] successfully modelled three dimensions of NMDS scores of non-forested wetland vegetation with R 2 CV of 0.92, 0.82 and 0.78 using partial least squares regression based only on hyperspectral data, however with the precondition of homogeneous plots.Using the same approach, Feilhauer et al. [37] modelled open heath land-vegetation with R 2 CV of 0.74, 0.58 and 0.54.The stagnation or even decrease in performance of several MNF+LiD models as compared to LiD alone could be due to the lower probability of LiD predictors to contribute to each tree and hence a methodological limitation.At maximum, 125 MNF variables were used together with 10 LiD variables, out of which only mtry predictors are sampled for node splitting.If there was such an effect, one would expect it to be reduced after feature selection, which reduced the numbers of MNF variables greatly, sometimes to equal numbers of LiD and MNF predictors.This, however, was not the case.Nevertheless, effects might be obscured by over-fitting as outlined below.
No studies have been published comparing LiDAR and hyperspectral data for modelling of plant species richness or communities.However, several studies have tested both for species classification.The results of this study regarding the added value of hyperspectral plus LiDAR data differ from those published in literature, where mostly improvements in classification accuracy are reported.Dalponte et al. [46] modelled 18 forest tree species using hyperspectral and LiDAR data and arrived at class [producer] accuracies of 40% to 90% with an increased accuracy for some tree species of up to 13% comparing only hyperspectral to the combination of hyperspectral and LiDAR data.Mundt et al. [86] reported an increase in overall mapping accuracies from 75% to 89% in mapping a single shrub species in a steppe environment when combining hyperspectral and LiDAR as opposed to using hyperspectral data alone.However, some studies also found that LiDAR provides improved models only in specific cases.Jones et al. [25] reported no overall improvements in classifying forest tree species, but specific improvements for some species by 5% to 12% in producer's accuracy.Moreover, Onojeghuo and Blackburn [87] found a negative impact of including LiDAR derived canopy height and surface models in a maximum likelihood classifier for mapping a single vegetation class.However, using LiDAR for post-classification correction resulted in increased accuracies.
It is a challenge of increasing importance to correlate biodiversity on the ground with remote sensing data.As a matter of fact, the detection of plant species α-diversity from remote sensing imagery, such as species richness or the Shannon index, is difficult to achieve due to complex vegetation structures (canopies) and different plant sizes.However, there is growing awareness that at the relevant scale of ecosystems and communities, other aspects of biodiversity such as β-diversity are more important for detecting and explaining changes [88].In detail there are various perceptions of β-diversity (see, [89,90]) but generally this aspect of biodiversity refers to qualitative differences between reference plots such as pixels in remote sensing data.Analysing spatial patterns of compositional contrast and integrating them to multi-plot comparisons between neighbouring datasets (e.g., raster cells) [91,92] allows to make a direct connection between the biotic information gathered on the ground and the remote sensing regarding spatial trends of dissimilarity and heterogeneity.When calculating values of dissimilarity, spatial grain (plot/pixel size) and distance decay must be considered [93,94].Summing up, β-diversity measures derived from community composition can be seen as a promising tool for the future development of remote sensing of biodiversity, as they are not bound to individual species traits or numbers.
Testing our second hypothesis, we examined whether the high dimensionality of the input predictor sets had a negative impact on model quality.Our results clearly show that using only a reduced set of relevant predictors did not substantially benefit the predictive performance of the random forest models in most cases.This confirms claims of random forests being able to deal with large amounts of non-informative predictors, rendering any attempts for preceding feature selection unnecessary [95].On the other hand, the step of selecting all relevant features [78] by means of the Boruta algorithm may not have been enough.Subsequent stepwise model building based on subsets of these features sensu Genuer et al. [72], i.e., the minimal-optimal feature set, may have improved the predictive performance, as the latter is often a unimodal function of the number of predictors in statistical models [69,74].Nevertheless, given the computational cost of up to 10,000 fitted random forests per mtry i , response and predictor variable in the presented cross-validation scheme, feature selection for random forests can be seen as a dead end.
There was substantial collinearity between the LiDAR predictors, which is a serious issue in statistical modelling.Random forests have been shown to be sensitive to collinearity among predictors concerning variable importance measures [96].However, with respect to predictive performance, which was the target quantity of our analysis, collinearity can be assumed to be negligible, as long as the collinearity among predictors remains constant [97].Nevertheless, extrapolating beyond the geographical or environmental range of training data may be problematic [98].
At a methodological level, this study calls for caution in relying only on OOB estimates for assessing prediction performance.The OOB accuracy estimates overlap very well with those obtained in ten-fold cross-validation unless feature selection is conducted in the presence of many noise variables.The increased difference between R 2 OOB and R 2 CV after feature selection may be due to over-fitting to the training dataset, although this is not expected to occur in random forests [65].This is noted also by Dudoit et al. [99] who state that "the out-of-bag method [. . .] for estimating misclassification rates results in overly optimistic estimates of the error rates" when preceded by feature selection.
So far, although we did not model species directly, species richness and communities were modelled by a direct approach sensu Turner et al. [10], i.e., we directly related reflectance to species communities or species richness, rather than first estimating environmental parameters (primary productivity, water stress, etc.) and then using these to model biodiversity variables.A promising next step is now to make full use of the information provided by the sensor systems and combining both direct and indirect approaches.This could mean including abiotic geomorphological predictors (slope, aspect, topographic heterogeneity, etc.) derived from the LiDAR terrain model, which have been shown to influence species distribution patterns at various scales (e.g., [100,101]).Moreover, these geomorphological variables, sometimes in conjunction with climate data, allow to model further predictors such as irradiation or soil wetness [42,102].
The sometimes limited predictive performance of the models might have been caused in part by the high variability in land cover types, which ranged from closed canopy forests to early-successional open areas resulting from bark beetle infestations and wind-throw.One does expect that decision tree based methods are flexible enough to deal with such difficulties.Nevertheless it remains to be tested whether, for example, a stratification of the response variable based on openness or "optical accessibility" can improve the models.
In order to model biodiversity, we broke down the problem to modelling indices and ordination scores separately for each axes.This discontinuous statistical analysis is one major limitation of the approach taken in this study, since the step of information condensation is accompanied by a loss of valuable ecological information such as taxonomic identity or multidimensional position in ordination space.We believe that further steps towards truly multivariate learning techniques are highly promising.A candidate method for direct prediction of communities, without the step of ordination or index calculation, might be for example multivariate random forests [103], yet these are still at the very early stages of development.

Conclusions
This study has analysed the potential of LiDAR (LiD) and hyperspectral (MNF) remote sensing for modelling vascular plant diversity patterns in complex forest ecosystems.Both are promising techniques for regional scale biodiversity modelling, but the information gain of their combination should be thoroughly evaluated.Overall, the added value of hyperspectral plus LiDAR data was found to be small, with a maximal improvement in cross-validated R 2 CV ≤ 0.07 across all analyses in this study.For modelling overall species richness, LiDAR predictors were the best choice (R 2 LiD = 0.3, R 2 MNF = 0.08, R 2 MNF+LiD = 0.2).Broken down into forest layers, LiDAR performed best for herb and shrub layer (R 2 LiD = 0.26 and 0.55, R 2 MNF = 0.03 and 0.02, R 2 MNF+LiD = 0.16 and 0.43, respectively), while hyperspectral predictors were better in explaining tree layer species richness, only surpassed by the combination of both (R 2 LiD = 0.11, R 2 MNF = 0.17, R 2 MNF+LiD = 0.22).When community composition was of interest, both systems provided similar performance overall and in the case of the herb layer (R 2 LiD = 0.75 and 0.55, R 2 MNF = 0.76 and 0.49, R 2 MNF+LiD = 0.78 and 0.52, respectively).For the tree

Figure 2 .
Figure 2. Processing framework.Predictor variables were derived from LiDAR (LiD) and hyperspectral (MNF) data.Biodiversity indicators (SR,H) and non-metric multidimensional scaling (NMDS) ordination scores of two NMDS axes were modelled as response variables (NMDS 1 ,NMDS 2 ).The procedure was repeated for herb, shrub and tree layer and their total, using LiD, MNF and MNF+LiD predictors both with and without feature selection.

Figure 4 .
Figure 4. Kendall's τ rank correlation for inter-predictor correlation.The frequency histograms are based on the upper triangular correlation matrix (diagonal removed).The LiDAR and MNF panels are based on correlation between all predictors within a predictor set, whereas the MNF+LiD panel shows only the correlation coefficients between the ten LiDAR and the MNF variables.

Figure 5 .
Figure 5. Model assessment of ten-fold cross-validation of random forests with and without feature selection and with mtry optimization.Response variables: Species richness (SR), Shannon index (H ) and ordination scores (NMDS 1 , NMDS 2 ).Predictor variables: LiDAR (LiD) and hyperspectral (MNF) predictor sets and their combination (MNF+LiD).Bars: cross-validated R 2 CV with error bars depicting the bootstrapped standard deviation of R 2CV .Points show the corresponding mean OOB estimates R 2 OOB averaged over the ten cross-validation groups together with their standard deviation in error bars.Values of R 2 below zero are artifacts of the calculation, should be interpreted as zero and are hence displayed as such.

Figure 6 .
Figure 6.Predicted species richness of the shrub layer (SL) for the LiDAR (LiD) and hyperspectral (MNF) predictor sets and their combination (MNF+LiD).Upper panel: averaged predictions of ten cross-validation models and plot information used to train and evaluate the random forests.Background: color-infrared composite of the HyMap data.Lower panel: standard deviation of the predictions of the ten cross-validation models.Bar charts show the residuals between predicted and observed estimates in cross-validation.Positive residuals are due to an overestimation by the model and vice versa.

Figure 6 .Figure 7 .
Figure 6.Predicted species richness of the shrub layer (SL) for the LiDAR (LiD) and hyperspectral (MNF) predictor sets and their combination (MNF+LiD).Upper panel: averaged predictions of ten cross-validation models and plot information used to train and evaluate the random forests.Background: CIR composite of the HyMap data.Lower panel: standard deviation of the predictions of the ten cross-validation models.Bar charts show the residuals between predicted and observed estimates in cross-validation.Positive residuals are due to an overestimation by the model and vice versa.

Figure 7 .Figure 8 .
Figure 7. Spatial prediction of position on the first NMDS axis of all vegetation layers combined (ALL) for the LiDAR (LiD) and hyperspectral (MNF) predictor sets and their combination (MNF+LiD).Upper panel: averaged predictions of ten cross-validation models and plot information used to train and evaluate the random forests.Background: CIR composite of the HyMap data.Lower panel: standard deviation of the predictions of the ten cross-validation models.Bar charts show the residuals between predicted and observed estimates in cross-validation.Positive residuals are due to an overestimation by the model and vice versa.

Figure 8 .
Figure 8. Spatial prediction of position on the first NMDS axis of the herb layer (HL) for the LiDAR (LiD) and hyperspectral (MNF) predictor sets and their combination (MNF+LiD).Upper panel: averaged predictions of ten cross-validation models and plot information used to train and evaluate the random forests.Background: CIR composite of the HyMap data.Lower panel: standard deviation of the predictions of the ten cross-validation models.Bar charts show the residuals between predicted and observed estimates in cross-validation.Positive residuals are due to an overestimation by the model and vice versa.